I've written a Ledger
class for a daemon process, which is supposed to keep track of an exchange's orderbook. My main concern is the way I keep track of the orders - more precisely, the containers I use to store them: a combination of set()
and dict()
appeared like the most feasible solution.
However, I'm wondering If haven't missed a sweet feature or nifty recipe that works better.
But first, my code - the lines in question are within the constructor:
import logging
from heapq import nlargest, nsmallest
log = logging.getLogger(__name__)
class OrderRemovalError(Exception):
pass
class Ledger:
def __init__(self):
self._ask_orders = {} # Keeps orders by price {price: (amount, count)}
self._ask_ladder = set() # keeps ask prices
self._bid_orders = {} # Keeps orders by price {price: (amount, count)}
self._bid_ladder = set() # keeps bid prices
def _add(self, order, ledger_orders, ledger_oders_ladder):
price, count, amount = order
price_key = str(price)
ledger_orders[price_key] = [amount, count]
ledger_oders_ladder.add(price)
return True
def _remove(self, order, ledger_orders, ledger_order_ladder):
price, amount, count = order
price_key = str(price)
err = []
c = 0
try:
ledger_orders.pop(price_key)
except KeyError:
c += 1
except Exception as e:
err.append(e)
try:
ledger_order_ladder.remove(price)
except ValueError:
c += 1
except Exception as e:
err.append(e)
if c == 2:
pass # order didn't exist.
return False
elif c == 1 or len(err) > 1:
raise OrderRemovalError("Something went wrong during deletion "
"of order %s! Error: %s" % ((*order,), err))
elif c == 0:
log.debug("Order deleted successfully! %s" % (order,))
return True
else:
log.error("Ledger._remove() Error Dump: "
"{order: %s, ledger_orders: %s, ledger_order_ladder: %s",
((*order,), ledger_orders, ledger_order_ladder))
raise ValueError("Passed order caused unforeseen result during "
"removal! Inputs have been logged.")
def bids(self, n=None):
"""
Returns n bids, returning best first; default returns all.
"""
if n is None:
bids = []
keys = [str(k) for k in sorted(self._bid_ladder, reverse=True)]
for key in keys:
bids.append((key, *self._bid_orders[key]))
return bids
elif n == 1:
# return best bid
return self._bid_orders[str(max(self._bid_ladder))]
elif n > 1:
#return nlargest
return self._bid_orders[str(nlargest(n, self._bid_ladder))]
def asks(self, n=None):
"""
Returns n asks, returning best first; default returns all.
"""
if n is None:
asks = []
keys = [str(k) for k in sorted(self._ask_ladder)]
for key in keys:
asks.append((key, *self._ask_orders[key]))
return asks
elif n == 1:
# return best bid
return self._ask_orders[str(min(self._ask_ladder))]
elif n > 1:
#return nsmallest
return self._ask_orders[nsmallest(n, self._ask_ladder)]
def remove_bid(self, order):
"""
Removes bid order if it exists in our ledger
:param order:
:return: True if it was removed, False if it didn't exist
"""
try:
return self._remove(order, self._bid_orders, self._bid_ladder)
except OrderRemovalError as e:
log.error(e)
def remove_ask(self, order):
"""
Removes ask order if it exists in our ledger
:param order:
:return: True if it was removed, False if it didn't exist
"""
try:
return self._remove(order, self._ask_orders, self._ask_ladder)
except OrderRemovalError as e:
log.error(e)
def add_bid(self, order):
return self._add(order, self._bid_orders, self._bid_ladder)
def add_ask(self, order):
return self._add(order, self._ask_orders, self._ask_ladder)
if __name__ == '__main__':
l = Ledger()
l.add_ask((500, 5.5, 5))
l.add_ask((501, 1.0, 4))
l.add_ask((502, 5.4, 70))
l.add_bid((500, 5.5, 5))
l.add_bid((501, 1.0, 4))
l.add_bid((502, 5.4, 70))
print(l.asks())
print(l.bids())
# Test updating
print("Testing update..")
l.add_bid((502, 5.4, 10))
l.add_ask((502, 5.4, 10))
print(l.asks())
print(l.bids())
# Test removing
print("Testing removing..")
l.remove_bid((501, 6, 0))
l.remove_ask((501, 6, 0))
print(l.asks())
print(l.bids())
print("Testing update..")
The main concern is performance - I will be making thousands of calls to this class, so I'd like to make the adding, removing and querying as fast as possible.
Some implementations I've already considered:
- Using the
itemgetter
from theoperator
module in combination withdict()
s only, eliminating the need forset()
s.
I didn't like this, because I had to check the entire dict by calling the index of each key (if I stored orders like so: {'price': [price, amount, count]}
). Keeping a set of dict keys seemed simpler and faster.
- Using
dict().keys()
as substitute forset()
I did consider this, but wasn't sure on how well this would perform, given the frequent calls it would receive. Could someone elaborate on how much better this would perform ? The most obvious advantage would be that it only calls on a single object (dict()
) instead of two (dict()
and set()
). The disadvantage appears to be to have to call dict.keys()
every time I want to return values from it (since I need to sort them first) - instead of just sorting the set()
.
- Using
namedtuple
instead ofdict()
I considered this at first, but since they're immutable, creating a new namedtuple
instance on every update seemed redundant and wasteful.
Those were my considerations. Please do feel free to correct any of these assumptions - I've been coding for just under a year professionally, so my experience is limited. Also, pointers where I can improve my code style are welcome as well.
1 Answer 1
Some people like to keep their code DRY rather than WET, I would recomend that you KISS.
If we look at The Zen of Python, you'll see:
Simple is better than complex.
What I'm trying to say is your code is hard to read, and you should definitely not start optimising from this base. As building on a base of sand is always going to lead to structural imperfections. And then one day it'll fail, and you'll need to start afresh.
So lets first look at how you can improve your base.
bids
andasks
change three things, and so you could merge these.bid
andask
have the same interface. Make a class for it.self._bid_ladder
is the equivalent ofself._bin_orders.keys()
.- Both
_add
and_remove
can be changed to be a couple of lines long by removing the 'ladders'. bid
andask
would make more sense if you allowed the new class to have the__getitem__
special method.
And so honestly I'd entirely re-write this.
Without __getitem__
you should be able to get to the following on your own:
class Orders:
def __init__(self, reverse=False):
self._orders = {}
self._reverse = reverse
def add(self, order):
self._orders[order[0]] = order
return True
def remove(self, price):
try:
self._orders.pop(price)
return True
except KeyError:
return False
class Ledger:
def __init__(self):
self.ask = Orders()
self.bid = Orders(True)
However as __getitem__
can take both an int
and a slice
, you'll have to take that into account.
You don't need heapq
and your usage of it was completely broken as you get a list back not an item.
You can do everything with sorted
and itertools.islice
.
And makes things simpler.
You want to:
- Sort the orders' dictionary.
- If is instance of
int
:- Perform an
islice
of one item. - Return the dict's item.
- Perform an
- If the key isn't an instance of
slice
:- Raise
TypeError
.
- Raise
- Otherwise perform an
islice
of the slices start, stop and step. - Return the sliced orders.
Or in Python:
def __getitem__(self, key):
keys = sorted(self._orders.keys(), reverse=self._reverse)
if isinstance(key, int):
key, = islice(keys, key, key + 1)
return self._orders[key]
elif not isinstance(key, slice):
raise TypeError()
return [
self._orders[key]
for key in islice(keys, key.start, key.stop, key.step)
]
Now that you've got a simpler base you can start improving performance.
If I were to want to improve the performance I'd change the dict that we use to a self made 'SortedDict'. Python already has an OrderedDict, and you can adapt the source-code for your needs.
This is as the only performance sink is the sorted
in __getitem__
.
But honestly I don't know if it's worth it.
-
\$\begingroup\$ Yea, I tend to not think about simplicity when I start, and so I always try to apply dry afterwards, which leaves me with code like this. Thanks for the pointers! \$\endgroup\$deepbrook– deepbrook2016年07月10日 13:36:14 +00:00Commented Jul 10, 2016 at 13:36
Explore related questions
See similar questions with these tags.