Here's my implementation of a bare-bones HashMap in Python. Expected behavior, it returns None if a key is missing, When a key is inserted the second time it just overwrites the first value.
class HashMap:
def __init__(self):
self.store = [None for _ in range(16)]
self.size = 0
def get(self, key):
key_hash = self._hash(key)
index = self._position(key_hash)
if not self.store[index]:
return None
else:
list_at_index = self.store[index]
for i in list_at_index:
if i.key == key:
return i.value
return None
def put(self, key, value):
p = Node(key, value)
key_hash = self._hash(key)
index = self._position(key_hash)
if not self.store[index]:
self.store[index] = [p]
self.size += 1
else:
list_at_index = self.store[index]
if p not in list_at_index:
list_at_index.append(p)
self.size += 1
else:
for i in list_at_index:
if i == p:
i.value = value
break
def __len__(self):
return self.size
def _hash(self, key):
if isinstance(key, int):
return key
result = 5381
for char in key:
result = 33 * result + ord(char)
return result
def _position(self, key_hash):
return key_hash % 15
class Node:
def __init__(self, key, value):
self.key = key
self.value = value
def __eq__(self, other):
return self.key == other.key
if __name__ == '__main__':
hashmap = HashMap()
hashmap.put(2, 12)
hashmap.put('asd', 13)
hashmap.put(2, 11)
print(hashmap.get(2))
print(hashmap.get('asd'))
print(hashmap.get('ade'))
Invite comments and reviews.
-
1\$\begingroup\$ I'd consider some hash functions described here: algs4.cs.princeton.edu/34hash \$\endgroup\$JaDogg– JaDogg2018年10月19日 09:11:43 +00:00Commented Oct 19, 2018 at 9:11
-
\$\begingroup\$ Also see softwareengineering.stackexchange.com/questions/49550/… for really good comparision of fast hashing algorithms \$\endgroup\$JaDogg– JaDogg2018年10月19日 09:36:59 +00:00Commented Oct 19, 2018 at 9:36
-
\$\begingroup\$ Why this hardcoded 15 in _position? \$\endgroup\$hartmut– hartmut2020年08月21日 19:05:46 +00:00Commented Aug 21, 2020 at 19:05
4 Answers 4
Initialization of
store
as a list ofNone
does not really buy you anything in terms of efficiency, but complicates the code quite a bit. I recommend to initialize it with empty lists instead.put
iterates the list twice. A single pass is sufficient:for i in list_at_index: if i == p: i.value = p.value return list_at_index.append(p) self.size += 1
store
has 16 slots, butkey_hash % 15
returns only 15 possible values. The last slot is never used.Absence of
delete
is somewhat questionable.
You can also collapse the loop in the get()
method using next()
replacing:
for i in list_at_index:
if i.key == key:
return i.value
return None
with:
return next((i.value for i in self.store[index] if i.key == key), None)
As a side note, I think it would be a good idea to avoid hardcoding the numbers like 5381
or 15
in the code and make them proper constants or hashmap initialization parameters.
Instead of rolling your own hash function _hash
you should use Python’s built-in hash()
.
-
\$\begingroup\$ I think he/she knows, because of the '-' before hash, but challenged herself to create the whole thing \$\endgroup\$Anonymous– Anonymous2020年05月19日 21:58:37 +00:00Commented May 19, 2020 at 21:58
You can define __getitem__
and __setitem__
instead of get
and put
to use square bracket notation (I don't think you would have to change the functions at all, just the names).