I'm working on class for a hash table where collisions are corrected using cuckoo hashing, I had difficulty writing the insert function, the function is supposed to insert the key if the h(key)
is available, if not it is supposed to find the shortest "path" for moving other keys in order to find a place for the inserted key, I'm supposed to have maximum maxdepth
moves.
Is there any way to improve my code?
class Cuckoo:
def __init__(self,funcs,tableSize,maxDepth):
self.table = [None]*tableSize
self.funcs = funcs
self.excess = []
self.maxDepth = maxDepth
self.size = tableSize
def insert(self,key):
if Cuckoo.find(self, key) != None: # if key exists
return None
# key doesn't exist
min_insert_way = Cuckoo.insert2(self, key, [], 0)
if min_insert_way[0] == -1: # if we can't commit max 3 replacement
self.excess.append(key)
return -1
# we can commit max 3 replacement
i = len(min_insert_way) - 1
orginal_place = self.funcs[min_insert_way[-1]] (key,self.size) # it is the index of key in which it entered
while ( i>=0 ):
function_index = min_insert_way[i]
replaced_element = self.table[self.funcs[function_index](key, self.size)]
self.table[self.funcs[function_index](key, self.size)] = key # we replace he replaced element with key
key = replaced_element
i = i - 1
return orginal_place
def insert2(self, key, way, displacements):
'''
the function get 4 parameters:
1. self == the hash table
2. key
3. way which is in the begging: way == []
4. displacements == how many self.funcs have used
the function return the way in which the insert will be. "way" is list of indexes of the self.funcs in which we do the insert
for exmple, if way == [2,3], we will put "key" in self.table[self.funcs[3]] instead of "replaced_element".
then, we will put "replaced_element" in self.table[self.funcs[2]].
as you can see, the indexes are from the right to the left.
if we have to use self.excess: way == [-1, X(1), X(1), ... X(max_depth)]
'''
if (displacements > self.maxDepth): # if we couldn't find max 3 displacements
return [-1]
for i in range(0, len(self.funcs)):
if self.table[self.funcs[i](key,self.size)] == None: # if one of the functions gives empty place
return [i]
# there is no an empty place
replaced_element = self.table[self.funcs[0](key,self.size)] # "replaced_element" will be replaced by "key"
min_way = Cuckoo.insert2(self, replaced_element, way, displacements+1)
min_way.append(0) # because we have used self.funcs[0]
for i in range (1,len(self.funcs)):
replaced_element = self.table[self.funcs[i](key,self.size)] # "replaced_element" will be replaced by "key"
current_way = Cuckoo.insert2(self, replaced_element, way, displacements+1)
current_way.append(i) # because we have used h(i)
if len(current_way) < len(min_way):
min_way = current_way
return min_way
-
\$\begingroup\$ You keep saying "supposed to" - have you tested it? Does it actually work? \$\endgroup\$jonrsharpe– jonrsharpe2014年05月31日 06:55:22 +00:00Commented May 31, 2014 at 6:55
-
\$\begingroup\$ @jonsharp yes it is working \$\endgroup\$user3369309– user33693092014年05月31日 07:07:29 +00:00Commented May 31, 2014 at 7:07
1 Answer 1
For one thing, I would suggest using a better name for your helper-function, that describes what it does:
def _find_insert_path(...):
Note the leading underscore - this is Python convention for "don't call this directly from outside the class". Users should call insert
, which doesn't have the underscore.
Second, you could use exceptions to do handle the case where the insertion depth is exceeded more cleanly. Alter _find_insert_path
to do:
if displacements > self.maxDepth:
raise KeyError("Maximum insertion path length exceeded.")
Which makes insert
use:
try:
min_insert_way = self._find_insert_path(key, [], 0)
except KeyError:
self.excess.append(key)
return -1
...
Note the call to _find_insert_path
; this is an instance method, you don't need to call it on the class and pass the instance Cuckoo.method(self, ...)
you can call it directly on the instance self.method(...)
.
This:
for i in range(0, len(self.funcs)):
if self.table[self.funcs[i](key,self.size)] == None:
return [i]
Is a pretty awkward way of looping over self.funcs
, not least because the start
argument to range
already defaults to 0
. Instead, use enumerate
:
for i, func in enumerate(self.funcs):
if self.table[func(key, self.size)] is None:
return [i]
Note that I am testing for the singleton None
with identity (is
) not equality (==
) per PEP-0008.
Your loop over min_insert_way
is similarly awkward; creating and incrementing your own index is usually a sign you're doing something wrong. You could instead use:
for function_index in reversed(min_insert_way):
Fianlly, it seems odd that your helper function returns a list of indices for self.funcs
. Instead, why not have it return the list of indices within self.table
to which each item in turn is moved? Then you don't have to call the hash functions at all in insert
, which becomes much simpler:
def insert(self, key):
if self.find(key):
return None
try:
path = self._find_insert_path(key, [], 0)
except KeyError:
self.excess.append(key)
return -1
for index in reversed(path):
key, self.table[index] = self.table[index], key
return path[-1]
Note the parallel assignment of key
and the value at self.table[index]
; this uses tuple packing and unpacking to replace three lines of code with one and remove the temporary variable replaced_element
.
This will also make the loop I mentioned above even simpler; you don't even need enumerate
:
for func in self.funcs:
i = func(key, self.size)
if self.table[i] is None:
return [i]
You could also make it neater using default arguments:
def _find_insert_path(self, key, path=None, depth=0):
if path is None:
path = []
...
Now the call from insert
is simply:
path = self._find_insert_path(key)
Explore related questions
See similar questions with these tags.