Background
I have two separate processes, WriteIt() and ReadIt(). One creates records, and the other processes the records in a DB cluster.
Once WriteIt() creates a record, it queues a ReadIt() task to process the same record.
To illustrate: [image]
Unfortunately, the database write and replication take an unpredictable amount of time, so ReadIt() has to keep checking for the presence of the updated record, which seems quite inefficient.
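A minimal sketch of the polling described above, for reference. Everything here is an assumption made for illustration: `wait_for_record`, `LaggingReplica`, and the delay values are hypothetical, and an in-memory object stands in for the replicated database.

```python
import time


def wait_for_record(fetch, record_id, attempts=5, base_delay=0.01):
    """Poll fetch(record_id) until it returns a record or attempts run out."""
    for attempt in range(attempts):
        record = fetch(record_id)
        if record is not None:
            return record
        # Back off exponentially so a slow replica is not hammered.
        time.sleep(base_delay * 2 ** attempt)
    raise LookupError(f"record {record_id} not visible after {attempts} attempts")


class LaggingReplica:
    """Hypothetical stand-in for a read replica that sees writes late."""

    def __init__(self, visible_after):
        self.calls = 0
        self.visible_after = visible_after

    def fetch(self, record_id):
        self.calls += 1
        # The record only becomes visible after `visible_after` failed lookups.
        if self.calls > self.visible_after:
            return {"id": record_id}
        return None


replica = LaggingReplica(visible_after=2)
record = wait_for_record(replica.fetch, 42)
```

In Celery, the same idea would probably map onto a bound task that calls `self.retry(countdown=...)` instead of sleeping inside the worker, but that is not shown here.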
Question
This has got to be a common pattern for distributed systems. So my questions are:
1. Is there a general term (or terms) for this pattern, so that I can read about how to solve it? Unfortunately, I don't even know what the right terminology is, so I've had a heck of a time doing research on Google/SO/Programmers.SE.
2. (for extra credit) Is there a specific common approach to solving this issue with SQLAlchemy/MySQL and Celery?
I recognize that #2 is pretty specific, so I would be happy with just #1 since I just need to be pointed in the right direction to research the pattern.
1 Answer
Yes: in distributed systems, the name for this is eventual consistency.
The common approach is to synchronously write the data to an event store and then write to SQL. Your queued background read job can rest assured that once the data is in the event store, the write has succeeded and the data won't be lost.
Usually people use a high-performance storage system that is very fast at handling and serialising a lot of concurrent writes for the event store.
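The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the `EventStore` class is a hypothetical in-memory log, `sqlite3` stands in for the MySQL cluster, and `queue.Queue` stands in for the Celery queue.

```python
import json
import queue
import sqlite3


class EventStore:
    """Hypothetical append-only log; a real one would be a durable service."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(json.dumps(event))  # serialise on write
        return len(self._events) - 1            # list position acts as the event id

    def get(self, event_id):
        return json.loads(self._events[event_id])


store = EventStore()
tasks = queue.Queue()              # stand-in for the Celery queue
db = sqlite3.connect(":memory:")   # stand-in for the MySQL cluster
db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value INTEGER)")


def write_it(record_id, value):
    # 1. Synchronous write to the event store; once this returns, the data is kept.
    event_id = store.append({"id": record_id, "value": value})
    # 2. Write to SQL; in a real cluster, replication may lag behind this call.
    db.execute("INSERT INTO records (id, value) VALUES (?, ?)", (record_id, value))
    db.commit()
    # 3. Queue the read task with the event id rather than the SQL row id.
    tasks.put(event_id)


def read_it():
    # The task reads from the event store, so it never waits on SQL replication.
    event_id = tasks.get_nowait()
    return store.get(event_id)


write_it(1, 10)
event = read_it()
```

The point of the design is that the queued task carries a reference into the event store, so it no longer needs to poll the SQL replicas for the new record.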
A good approach is the one often called command query responsibility segregation (CQRS) with event sourcing.
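To make those two terms concrete, here is a small sketch, assuming a hypothetical `RecordAdded` event and a per-family total as the read model (loosely modelled on a group calculation across a family of records). The names and the validation rule are illustrative only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordAdded:
    """Hypothetical event: a record with a value was added to a family."""
    family: str
    value: int


event_log = []  # append-only list of events, the source of truth


def handle_add_record(family, value):
    """Command side: validate, then append an event; nothing is updated in place."""
    if value < 0:  # hypothetical validation rule, for illustration only
        raise ValueError("value must be non-negative")
    event_log.append(RecordAdded(family, value))


def family_totals(events):
    """Query side: build a read model by replaying the events in order."""
    totals = defaultdict(int)
    for event in events:
        totals[event.family] += event.value
    return dict(totals)


handle_add_record("a", 3)
handle_add_record("b", 5)
handle_add_record("a", 4)
totals = family_totals(event_log)
```

Commands only append events and queries only read projections built from them, which is the separation CQRS refers to; keeping the event log as the source of truth is the event sourcing part.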
Thanks! This is a great set of terms to get me going... exactly what I was looking for! +1... will wait a couple days to accept the answer in case other responses trickle in, but thanks so much – tohster, Apr 1, 2015 at 6:07
… ReadIt() after inserting? It'd be much faster and more reliable. If the record needs to be updated by ReadIt(), you can update it by ID or something.
ReadIt() does a series of tasks which involve querying the DB for a family of records, including the one just added. It then does a group calculation across all of those records, so I'd like to make sure I'm doing these operations only after the new record is added.