Simulating memcache get and set race condition and solving it with add

Question 1

Memcache get and set can hit a race condition if you use them together to achieve something like locking because it is not atomic. A Simple memcache locking logic using get and set would look like:

def access(resource):
 if memcache.get(resource):
 return "Access Denied"
 else:
 memcache.set(resource, timestamp)
 return "Access Granted and Lock Obtained"

If this code is running in multi-threaded web applications, you can have a scenario where two get can happen together. Hence both will get the access the resource.

add doesn't get into this problem because it just sets the key only if the key does not exists. It also makes only a single roundtrip and hence it is atomic by default.

The same API with add would look like:

def access(resource):
 if memcache.add(resource, timestamp):
 return "Access Granted and Lock Obtained"
 else:
 return "Acess Denied"

Now, I am writing a script to simulate this race condition while using get , set and prove that using add solves the concurrent access problem with this naive memcache lock implementation. I am using files as my resource here. Multiple users can see the same file in the web application front-end and move it to another location if they get a lock to that file. However, sometimes get, set race condition happens giving access to the file for two users leading to file not found error for one of the users. Using add doesn't lead to this problem.

The script uses threading to simulate this scenario. If a file not found error happens then its considered a failure. I have two variants of the move function, one using get, set and another using add which can be used in the run method of the thread to demonstrate both functionalities.

Can anyone review this and tell me if this code is fine? A high level logical review is also fine if it is hard to set this up in your machine. One thing I notice is that the ratio of error is pretty high, but is this because I am actually doing threading which really is not random.

Pre-requisites: You need memcache module and memcached server running on port 5551 to run this. You also need a folder called archive to move the files.

import memcache
import signal
import sys
import shutil
import os
import threading
import time
mc = memcache.Client(['127.0.0.1:5551'], debug=0)
cwd = os.getcwd()
error = 0
processed = 0
def touch(fname, times=None):
 with open(fname, 'a'):
 os.utime(fname, times)
def signal_handler(signal, frame):
 print('You pressed Ctrl+C!')
 sys.exit(0)
signal.signal(signal.SIGINT, signal_handler)
print("Ctrl-C exits")
def create_files():
 for fname in range(1,11):
 touch(str(fname) + ".txt")
def move_files_get_set():
 global error
 global processed
 files = [file for file in os.listdir(".") if file.endswith(".txt")]
 for file in files:
 if os.path.isfile(file) and not mc.get(file):
 processed = processed + 1 
 mc.set(file, int(round(time.time() * 1000)))
 try:
 shutil.move(os.path.join(cwd, file), os.path.join(cwd, "archive", "%s" % file))
 except Exception as e:
 print(e)
 processed = processed - 1
 error = error + 1
 print("%s errors happened with %s" % (error, processed))
 mc.set(file, None)
def move_files_add():
 global error
 global processed
 files = [file for file in os.listdir(".") if file.endswith(".txt")]
 for file in files:
 if os.path.isfile(file) and mc.add(file, int(round(time.time() * 1000))):
 processed = processed + 1 
 try:
 shutil.move(os.path.join(cwd, file), os.path.join(cwd, "archive", "%s" % file))
 except Exception as e:
 print(e)
 processed = processed - 1
 error = error + 1
 print("%s errors happened with %s" % (error, processed))
 mc.set(file, None)
class myThread (threading.Thread):
 def __init__(self, threadID, name, counter):
 threading.Thread.__init__(self)
 self.threadID = threadID
 self.name = name
 self.counter = counter
 def run(self):
 print "Starting " + self.name
 move_files_get_set()
 print "Exiting " + self.name
def trigger():
 while True:
 create_files()
 t1 = myThread("1", "Thread 1", "1")
 t2 = myThread("2", "Thread 2", "2")
 t1.start()
 t2.start()
 t1.join()
 t2.join()

Question 2

You have a race condition!

But not just the one you expected. Let's take out the 'beef' of the algorithm, and make something simpler:

import threading
processed = 0
def move_files_get_set():
 global processed
 for val in xrange(100000):
 processed = processed + 1
class myThread (threading.Thread):
 def __init__(self, threadID, name, counter):
 threading.Thread.__init__(self)
 self.threadID = threadID
 self.name = name
 self.counter = counter
 def run(self):
 print "Starting " + self.name
 move_files_get_set()
 print "Exiting " + self.name
def trigger():
 t1 = myThread("1", "Thread 1", "1")
 t2 = myThread("2", "Thread 2", "2")
 t1.start()
 t2.start()
 t1.join()
 t2.join()
 print processed
if __name__ == "__main__":
 trigger()

What I did was basically take out everything related to memcache and files.

Here's the output:

$ python base.py
Starting Thread 1
Starting Thread 2
Exiting Thread 1
Exiting Thread 2
123418
$ python base.py
Starting Thread 1
Starting Thread 2
Exiting Thread 2
Exiting Thread 1
182550

Another race condition

In move_files_add.

def move_files_add():
 global error
 global processed
 files = [file for file in os.listdir(".") if file.endswith(".txt")]
 for file in files:
 if os.path.isfile(file) and mc.add(file, int(round(time.time() * 1000))):
 processed = processed + 1 
 try:
 shutil.move(os.path.join(cwd, file), os.path.join(cwd, "archive", "%s" % file))
 except Exception as e:
 print(e)
 processed = processed - 1
 error = error + 1
 print("%s errors happened with %s" % (error, processed))
 mc.set(file, None)

What can happen is:

* p1: os.path.isfile(file) -> true
* p2: os.path.isfile(file) -> true
* p1: mc.add(file, bla) -> true
* p1: shutil.move() -> done
* p1: mc.set(file, None) -> done
* p2: mc.add(file, bla) -> true
* p2: shutil.move() -> KABOOM!

Moving files the correct way

When doing files, and there is a risk of race conditions, (almost) always use the following pattern:

try:
 os.rename(source, target)
except FileNotFoundError:
 # Oops, too late!

Now, os.rename has a problem, it only works on the same filesystem. So depending on your needs you could do something like:

try:
 os.rename(source, tempfile_in_same_dir_as_source)
except FileNotFoundError:
 # Oops, too late! No worries.
 pass
else:
 # Yes, it worked. Now we just need to make sure it's moved
 # to the right target location.
 shutil.move(tempfile_in_same_dir_as_source, target)

Question 3

Thanks for analyzing this one. Thread race itself can't be avoided but that is to simulate the real time scenario of users. So I need that :-) The second point is valid, I can bump into that scenario. I have enhanced the add function in this review question codereview.stackexchange.com/questions/131258/…. Do you see any races there? Solving using try: except is easy way out, so I want to avoid try: except as a challenge. Is there a theoretical way to avoid races, is that possible etc.

Question 4

I tried also looking at that one yesterday, but wemt to bed instead. Maybe I'll take a look later today. For now: why not consider compare-and-set. neopythonic.blogspot.nl/2011/08/… ? It's supposed to also be used for atomic mutations.

Question 5

I took a look. But compare-and-set can be used to implement compare-and-swap, which gives you a better ability to report the key of the process doing the handling.

Question 6

Read this as 3rd comment. Fixing a typo. Yeah, please have a look. I read that blog by GVR. It helped me understand race condition better but compare-and-set seemed to be for handling dirty state while doing get-set which different from my use case like say when you want to append that value to a new value etc. In my case add seemed to be betterv and the development team also pointed out some examples using add. More than data the avalibility or not of the key seemed more important.

Question 7

If compare-and-set is not needed, it's not needed. I already added answer to the other question.

score 1 · Accepted Answer · 2016-06-06 20:38:00Z

You have a race condition!

But not just the one you expected. Let's take out the 'beef' of the algorithm, and make something simpler:

import threading
processed = 0
def move_files_get_set():
 global processed
 for val in xrange(100000):
 processed = processed + 1
class myThread (threading.Thread):
 def __init__(self, threadID, name, counter):
 threading.Thread.__init__(self)
 self.threadID = threadID
 self.name = name
 self.counter = counter
 def run(self):
 print "Starting " + self.name
 move_files_get_set()
 print "Exiting " + self.name
def trigger():
 t1 = myThread("1", "Thread 1", "1")
 t2 = myThread("2", "Thread 2", "2")
 t1.start()
 t2.start()
 t1.join()
 t2.join()
 print processed
if __name__ == "__main__":
 trigger()

What I did was basically take out everything related to memcache and files.

Here's the output:

$ python base.py
Starting Thread 1
Starting Thread 2
Exiting Thread 1
Exiting Thread 2
123418
$ python base.py
Starting Thread 1
Starting Thread 2
Exiting Thread 2
Exiting Thread 1
182550

Another race condition

In move_files_add.

def move_files_add():
 global error
 global processed
 files = [file for file in os.listdir(".") if file.endswith(".txt")]
 for file in files:
 if os.path.isfile(file) and mc.add(file, int(round(time.time() * 1000))):
 processed = processed + 1 
 try:
 shutil.move(os.path.join(cwd, file), os.path.join(cwd, "archive", "%s" % file))
 except Exception as e:
 print(e)
 processed = processed - 1
 error = error + 1
 print("%s errors happened with %s" % (error, processed))
 mc.set(file, None)

What can happen is:

* p1: os.path.isfile(file) -> true
* p2: os.path.isfile(file) -> true
* p1: mc.add(file, bla) -> true
* p1: shutil.move() -> done
* p1: mc.set(file, None) -> done
* p2: mc.add(file, bla) -> true
* p2: shutil.move() -> KABOOM!

Moving files the correct way

When doing files, and there is a risk of race conditions, (almost) always use the following pattern:

try:
 os.rename(source, target)
except FileNotFoundError:
 # Oops, too late!

Now, os.rename has a problem, it only works on the same filesystem. So depending on your needs you could do something like:

try:
 os.rename(source, tempfile_in_same_dir_as_source)
except FileNotFoundError:
 # Oops, too late! No worries.
 pass
else:
 # Yes, it worked. Now we just need to make sure it's moved
 # to the right target location.
 shutil.move(tempfile_in_same_dir_as_source, target)

Thanks for analyzing this one. Thread race itself can't be avoided but that is to simulate the real time scenario of users. So I need that :-) The second point is valid, I can bump into that scenario. I have enhanced the add function in this review question codereview.stackexchange.com/questions/131258/…. Do you see any races there? Solving using try: except is easy way out, so I want to avoid try: except as a challenge. Is there a theoretical way to avoid races, is that possible etc.
I tried also looking at that one yesterday, but wemt to bed instead. Maybe I'll take a look later today. For now: why not consider compare-and-set. neopythonic.blogspot.nl/2011/08/… ? It's supposed to also be used for atomic mutations.
I took a look. But compare-and-set can be used to implement compare-and-swap, which gives you a better ability to report the key of the process doing the handling.
Read this as 3rd comment. Fixing a typo. Yeah, please have a look. I read that blog by GVR. It helped me understand race condition better but compare-and-set seemed to be for handling dirty state while doing get-set which different from my use case like say when you want to append that value to a new value etc. In my case add seemed to be betterv and the development team also pointed out some examples using add. More than data the avalibility or not of the key seemed more important.
If compare-and-set is not needed, it's not needed. I already added answer to the other question.

Stack Exchange Network

Simulating memcache get and set race condition and solving it with add

1 Answer 1

You have a race condition!

Another race condition

Moving files the correct way

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Simulating memcache get and set race condition and solving it with add

1 Answer 1

You have a race condition!

Another race condition

Moving files the correct way

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions