How can I optimize my insert function for the cuckoo hash table?

Question 1

I'm working on class for a hash table where collisions are corrected using cuckoo hashing, I had difficulty writing the insert function, the function is supposed to insert the key if the h(key) is available, if not it is supposed to find the shortest "path" for moving other keys in order to find a place for the inserted key, I'm supposed to have maximum maxdepth moves. Is there any way to improve my code?

class Cuckoo:
 def __init__(self,funcs,tableSize,maxDepth):
 self.table = [None]*tableSize
 self.funcs = funcs
 self.excess = []
 self.maxDepth = maxDepth
 self.size = tableSize
 def insert(self,key):
 if Cuckoo.find(self, key) != None: # if key exists
 return None
 # key doesn't exist
 min_insert_way = Cuckoo.insert2(self, key, [], 0)
 if min_insert_way[0] == -1: # if we can't commit max 3 replacement
 self.excess.append(key)
 return -1
 # we can commit max 3 replacement
 i = len(min_insert_way) - 1
 orginal_place = self.funcs[min_insert_way[-1]] (key,self.size) # it is the index of key in which it entered
 while ( i>=0 ):
 function_index = min_insert_way[i]
 replaced_element = self.table[self.funcs[function_index](key, self.size)]
 self.table[self.funcs[function_index](key, self.size)] = key # we replace he replaced element with key
 key = replaced_element
 i = i - 1
 return orginal_place
 def insert2(self, key, way, displacements):
 '''
 the function get 4 parameters:
 1. self == the hash table
 2. key
 3. way which is in the begging: way == []
 4. displacements == how many self.funcs have used
 the function return the way in which the insert will be. "way" is list of indexes of the self.funcs in which we do the insert
 for exmple, if way == [2,3], we will put "key" in self.table[self.funcs[3]] instead of "replaced_element".
 then, we will put "replaced_element" in self.table[self.funcs[2]].
 as you can see, the indexes are from the right to the left.
 if we have to use self.excess: way == [-1, X(1), X(1), ... X(max_depth)]
 '''
 if (displacements > self.maxDepth): # if we couldn't find max 3 displacements 
 return [-1]
 for i in range(0, len(self.funcs)):
 if self.table[self.funcs[i](key,self.size)] == None: # if one of the functions gives empty place
 return [i]
 # there is no an empty place
 replaced_element = self.table[self.funcs[0](key,self.size)] # "replaced_element" will be replaced by "key"
 min_way = Cuckoo.insert2(self, replaced_element, way, displacements+1)
 min_way.append(0) # because we have used self.funcs[0]
 for i in range (1,len(self.funcs)):
 replaced_element = self.table[self.funcs[i](key,self.size)] # "replaced_element" will be replaced by "key"
 current_way = Cuckoo.insert2(self, replaced_element, way, displacements+1)
 current_way.append(i) # because we have used h(i)
 if len(current_way) < len(min_way):
 min_way = current_way
 return min_way

Question 2

You keep saying "supposed to" - have you tested it? Does it actually work?

Question 3

@jonsharp yes it is working

Question 4

For one thing, I would suggest using a better name for your helper-function, that describes what it does:

def _find_insert_path(...):

Note the leading underscore - this is Python convention for "don't call this directly from outside the class". Users should call insert, which doesn't have the underscore.

Second, you could use exceptions to do handle the case where the insertion depth is exceeded more cleanly. Alter _find_insert_path to do:

if displacements > self.maxDepth:
 raise KeyError("Maximum insertion path length exceeded.")

Which makes insert use:

try:
 min_insert_way = self._find_insert_path(key, [], 0)
except KeyError:
 self.excess.append(key)
 return -1
...

Note the call to _find_insert_path; this is an instance method, you don't need to call it on the class and pass the instance Cuckoo.method(self, ...) you can call it directly on the instance self.method(...).

This:

for i in range(0, len(self.funcs)):
 if self.table[self.funcs[i](key,self.size)] == None:
 return [i]

Is a pretty awkward way of looping over self.funcs, not least because the start argument to range already defaults to 0. Instead, use enumerate:

for i, func in enumerate(self.funcs):
 if self.table[func(key, self.size)] is None:
 return [i]

Note that I am testing for the singleton None with identity (is) not equality (==) per PEP-0008.

Your loop over min_insert_way is similarly awkward; creating and incrementing your own index is usually a sign you're doing something wrong. You could instead use:

for function_index in reversed(min_insert_way):

Fianlly, it seems odd that your helper function returns a list of indices for self.funcs. Instead, why not have it return the list of indices within self.table to which each item in turn is moved? Then you don't have to call the hash functions at all in insert, which becomes much simpler:

def insert(self, key):
 if self.find(key):
 return None
 try:
 path = self._find_insert_path(key, [], 0)
 except KeyError:
 self.excess.append(key)
 return -1
 for index in reversed(path):
 key, self.table[index] = self.table[index], key
 return path[-1]

Note the parallel assignment of key and the value at self.table[index]; this uses tuple packing and unpacking to replace three lines of code with one and remove the temporary variable replaced_element.

This will also make the loop I mentioned above even simpler; you don't even need enumerate:

for func in self.funcs:
 i = func(key, self.size)
 if self.table[i] is None:
 return [i]

You could also make it neater using default arguments:

def _find_insert_path(self, key, path=None, depth=0):
 if path is None:
 path = []
 ...

Now the call from insert is simply:

path = self._find_insert_path(key)

jonrsharpe jonrsharpe 14k2 gold badges36 silver badges62 bronze badges · Answer 1 · 2014-05-31 07:36:34Z

For one thing, I would suggest using a better name for your helper-function, that describes what it does:

def _find_insert_path(...):

Note the leading underscore - this is Python convention for "don't call this directly from outside the class". Users should call insert, which doesn't have the underscore.

Second, you could use exceptions to do handle the case where the insertion depth is exceeded more cleanly. Alter _find_insert_path to do:

if displacements > self.maxDepth:
 raise KeyError("Maximum insertion path length exceeded.")

Which makes insert use:

try:
 min_insert_way = self._find_insert_path(key, [], 0)
except KeyError:
 self.excess.append(key)
 return -1
...

Note the call to _find_insert_path; this is an instance method, you don't need to call it on the class and pass the instance Cuckoo.method(self, ...) you can call it directly on the instance self.method(...).

This:

for i in range(0, len(self.funcs)):
 if self.table[self.funcs[i](key,self.size)] == None:
 return [i]

Is a pretty awkward way of looping over self.funcs, not least because the start argument to range already defaults to 0. Instead, use enumerate:

for i, func in enumerate(self.funcs):
 if self.table[func(key, self.size)] is None:
 return [i]

Note that I am testing for the singleton None with identity (is) not equality (==) per PEP-0008.

Your loop over min_insert_way is similarly awkward; creating and incrementing your own index is usually a sign you're doing something wrong. You could instead use:

for function_index in reversed(min_insert_way):

Fianlly, it seems odd that your helper function returns a list of indices for self.funcs. Instead, why not have it return the list of indices within self.table to which each item in turn is moved? Then you don't have to call the hash functions at all in insert, which becomes much simpler:

def insert(self, key):
 if self.find(key):
 return None
 try:
 path = self._find_insert_path(key, [], 0)
 except KeyError:
 self.excess.append(key)
 return -1
 for index in reversed(path):
 key, self.table[index] = self.table[index], key
 return path[-1]

Note the parallel assignment of key and the value at self.table[index]; this uses tuple packing and unpacking to replace three lines of code with one and remove the temporary variable replaced_element.

This will also make the loop I mentioned above even simpler; you don't even need enumerate:

for func in self.funcs:
 i = func(key, self.size)
 if self.table[i] is None:
 return [i]

You could also make it neater using default arguments:

def _find_insert_path(self, key, path=None, depth=0):
 if path is None:
 path = []
 ...

Now the call from insert is simply:

path = self._find_insert_path(key)

Stack Exchange Network

How can I optimize my insert function for the cuckoo hash table?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

How can I optimize my insert function for the cuckoo hash table?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions