I've written a simple LRU cache class and I am trying to make it thread-safe. My thoughts are that I just need to wrap the code that updates the ordered dict in a lock so that if any thread is writing to the ordered dict, all other writes/reads need to wait. Is that correct?
Also, what is the best way to test that something is thread-safe?
import threading
from collections import OrderedDict
class Cache:
def __init__(self, capacity):
self._count = 0
self._capacity = capacity
self._store = OrderedDict()
self._lock = threading.Lock()
def count(self):
return self._count
def capacity(self):
return self._capacity
def get(self, key):
if key in self._store:
self._store.move_to_end(key)
return self._store[key]
def set(self, key, value):
with self._lock: # this was the only place i thought that needed locking
if key not in self._store:
if self._count >= self._capacity:
self._store.popitem(last=False)
self._count -= 1
self._store[key] = value
self._count += 1
else:
self._store[key] = value
self._store.move_to_end(key)
def containsKey(self, key):
return key in self._store
-
1\$\begingroup\$ Welcome to Code Review! I changed the title so that it describes what the code does per site goals: "State what your code does in your title, not your main concerns about it.". Please check that I haven't misrepresented your code, and correct it if I have. \$\endgroup\$Toby Speight– Toby Speight2021年06月13日 19:57:37 +00:00Commented Jun 13, 2021 at 19:57
2 Answers 2
The use of self._store
isn't thread-safe, because we are only locking its writers. We need to ensure that readers don't see it in an inconsistent state, too.
-
\$\begingroup\$ Thanks Toby!! So you're saying that it's possible that 1 thread may be writing, but since I'm not locking any reads, it's possible for another thread to get an older value. So say the cache has v1. A write happens at t2 for value v2, but a read could potentially happen at t3 that still gets v1? \$\endgroup\$sleepyowl– sleepyowl2021年06月13日 20:14:23 +00:00Commented Jun 13, 2021 at 20:14
-
\$\begingroup\$ I guess i could test it, i just learned how to create threads! :) Sounds like I could make a write take long while a read thread goes and see what I'm getting back. \$\endgroup\$sleepyowl– sleepyowl2021年06月13日 20:20:31 +00:00Commented Jun 13, 2021 at 20:20
-
\$\begingroup\$ Yes, it's possible that one thread could be in
get()
while another is inset()
, and that would be bad, sinceget()
doesn't take the lock. Automatically testing for race conditions is hard; I don't know of any means other than hammering the code with many threads and checking for inconsistencies - but that's only a stochastic approach, and it can't prove the code correct. \$\endgroup\$Toby Speight– Toby Speight2021年06月13日 20:25:01 +00:00Commented Jun 13, 2021 at 20:25
I've experienced that key in self._store
is not a thread-safe way to check if a key exists in a dictionary, even when wrapped in a with lock
.
Replacing key in self._store
with key in self._store.keys()
, and wrapping all accessors and setters on self._store
with with self._lock
should be sufficient.