I have a set of objects states which is greater than I think it would be reasonable to thread or process at a 1:1 basis, let's say it looks like this
class SubState(object):
def __init__(self):
self.stat_1 = None
self.stat_2 = None
self.list_1 = []
class State(object):
def __init__(self):
self.my_sub_states = {'a': SubState(), 'b': SubState(), 'c': SubState()}
What I'd like to do is to make each of the sub_states to the self.my_sub_states keys shared, and simply access them by grabbing a single lock for the entire sub-state - i.e. self.locks={'a': multiprocessing.Lock(), 'b': multiprocessing.Lock() etc. and then release it when I'm done. Is there a class I can inherit to share an entire SubState object with a single Lock?
The actually process workers would be pulling tasks from a queue (I can't pass the sub_states as args into the process because they don't know which sub_state they need until they get the next task).
Edit: also I'd prefer not to use a manager - manager's are atrociously slow (I haven't done the benchmarks but I'm inclined to think an in memory database would work faster than a manager if it came down to it).
1 Answer 1
As the multiprocessing docs state, you've really only got two options for actually sharing state between multiprocessing.Process instances (at least without going to third-party options - e.g. redis):
- Use a
Manager - Use
multiprocessing.sharedctypes
A Manager will allow you to share pure Python objects, but as you pointed out, both read and write access to objects being shared this way is quite slow.
multiprocessing.sharedctypes will use actual shared memory, but you're limited to sharing ctypes objects. So you'd need to convert your SubState object to a ctypes.Struct. Also of note is that each multiprocessing.sharedctypes object has its own lock built-in, so you can synchronize access to each object by taking that lock explicitly before operating on it.
3 Comments
ctypes (it sounds like I'll have a signaling problem then too - one could use a manager or a shared hash table to keep track of all of the shared c.structs but it's not too pretty) I'm really leaning towards Redis - and rewriting parts of the code in C if Redis proves too slow.
self.locks(so everyone can read from it in parallel) and a single lock aroundself.sub_statesto prevent concurrent writing or no lock aroundself.sub_stateseither?