I attempted making a hashmap in python, and it was harder, due to some limitations but this is my version of dictionaries in python. Are there any way to simplify or do the same thing in less code along with tips and tricks?
class HashMap:
def __init__(self, memory): # refers to the length of the bucket
self.data = [None] * memory
self.memory = memory
def _hash(self, key):
hashed_value = 0
bucket_length = self.memory
string_length = len(key)
i = 0
while i < string_length:
hashed_value += (ord(key[i]) * i) % bucket_length
if hashed_value > bucket_length-1:
hashed_value %= bucket_length-1
i += 1
return hashed_value
def set(self, key, value):
address = self._hash(key)
bucket = self.data[address]
if not bucket:
self.data[address] = [key, value, None] # None refers to next pointer
else:
while bucket[2] != None:
bucket = bucket[2]
bucket[2] = [key, value, None]
def get(self, key):
address = self._hash(key)
bucket = self.data[address]
if bucket:
while bucket[2] != None or key != bucket[0]:
bucket = bucket[2]
if bucket:
return bucket[1]
raise KeyError
def keys(self):
keys_list = []
bucket_list = self.data
for bucket in bucket_list:
current_bucket = bucket
if bucket:
while current_bucket != None:
keys_list.append(current_bucket[0])
current_bucket = current_bucket[2]
return keys_list
2 Answers 2
Your implementation of get
is wrong. The code is:
def get(self, key):
address = self._hash(key)
bucket = self.data[address]
if bucket:
while bucket[2] != None or key != bucket[0]:
bucket = bucket[2]
if bucket:
return bucket[1]
raise KeyError
The line while bucket[2] != None or key != bucket[0]
says "keep traversing the link list as long as it's possible to do so, and if it's impossible, try to do it anyway if the key is wrong". Because of the boolean or
, the condition bucket[2] != None
means the loop will always step forward in the linked list if it's possible to do so - even if the current key is correct. On top of that, once the loop gets to the last element, if the key at that position does not match the given key, the loop will attempt to iterate once more, giving us:
TypeError Traceback (most recent call last)
<ipython-input-7-a5939dc0e83e> in <module>()
----> 1 h.get("apple")
<ipython-input-1-4777e6d3506b> in get(self, key)
31 bucket = self.data[address]
32 if bucket:
---> 33 while bucket[2] != None or key != bucket[0]:
34 bucket = bucket[2]
35 if bucket:
TypeError: 'NoneType' object is not subscriptable
The result is get
fails with this error in every case except when the requested key is the last one in its slot.
The correct condition is of course while bucket[2] != None and key != bucket[0]
. We then need to check afterwards that we got out of the loop because we found the right key, not because we ran out of buckets, giving us the implementation:
def get(self, key):
address = self._hash(key)
bucket = self.data[address]
if bucket:
while bucket[2] != None and key != bucket[0]:
bucket = bucket[2]
if bucket[0] == key:
return bucket[1]
raise KeyError
-
\$\begingroup\$ ahhh, that is very true, i couldn't test for these conditions and just assumed it works based on my first testcases, since generating the same memory place by hash, seemed very hard \$\endgroup\$Anonymous– Anonymous2020年05月20日 17:17:10 +00:00Commented May 20, 2020 at 17:17
-
1\$\begingroup\$ @MahdeenSky You can test it by setting the memory to 1 when you create the hashmap \$\endgroup\$Jack M– Jack M2020年05月20日 17:18:20 +00:00Commented May 20, 2020 at 17:18
-
\$\begingroup\$ oh ill keep it mind, ty for taking your time fixing a mistake of mine, when it doesn't work as expected, since people tend to report the thread and downvote it for not working, on the other hand, people like you are what this stackexchange needs!! \$\endgroup\$Anonymous– Anonymous2020年05月20日 17:30:48 +00:00Commented May 20, 2020 at 17:30
Your implementation of set
is also wrong. If you set the same key twice, the first value should be overwritten. Instead, the new key-value pair is added to the end of the bucket list, so memory usage will increase, the keys
method will return multiple copies of the same key.
-
\$\begingroup\$ basically i should add change the condition of while bucket != None to while bucket != None and key != bucket[0] \$\endgroup\$Anonymous– Anonymous2020年05月20日 17:14:46 +00:00Commented May 20, 2020 at 17:14
-
\$\begingroup\$ ... followed by doing different things depending on whether or not
key
was found. If you simply changed the condition, you'd lose any subsequent key-value pairs that were added to thatbucket
after thatkey
. \$\endgroup\$AJNeufeld– AJNeufeld2020年05月20日 17:21:29 +00:00Commented May 20, 2020 at 17:21 -
\$\begingroup\$ oh a rather good point, so i basically only change the None if it loops through the whole keys in the memory, and change the value only if it isn't the above \$\endgroup\$Anonymous– Anonymous2020年05月20日 17:25:38 +00:00Commented May 20, 2020 at 17:25
-
\$\begingroup\$ Once you've fixed your code, you should post a follow-up question with the updated code, complete with test cases you've used test the new implementation. There is much more that can be improved in your implementation, but with an accepted answer, it is unlikely you will not garner any more feedback on this question. \$\endgroup\$AJNeufeld– AJNeufeld2020年05月20日 17:36:07 +00:00Commented May 20, 2020 at 17:36
Explore related questions
See similar questions with these tags.
%
with anything other than the size is usually wrong. And> bucket_length - 1
? What about>= bucket_length
? What's so special aboutbucket[2]
? \$\endgroup\$