I'm trying to design a data-update mechanism in my micro-services architecture.
For the sake of simplicity, let's assume we have two micro-services, A and B. B exposes a simple REST API for creating tasks: `POST /tasks` creates a task and returns a unique identifier, `task_id`, to query on. The status of any created task can then be queried through another endpoint, `GET /tasks/{task_id}`. With this, A can create tasks and use a polling mechanism to track progress.
The next improvement we would like to add is a "push API": asynchronous progress updates delivered through a message broker (e.g., RabbitMQ). Whenever a task's status changes, B publishes an update to the broker, and A receives it instead of polling.
This is the expected flow:
1. A requests B to create a task synchronously
2. A subscribes to changes of `tasks.{task_id}`
3. B publishes a change of `task_id`
Steps 2 and 3 can be reordered, causing A to miss updates or never receive any at all (if the task completed before A subscribed).
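The reordering can be reproduced with a toy in-memory broker. This is only a sketch: `FanoutBroker`, the topic name `tasks.42`, and the `COMPLETED` status are hypothetical, and it assumes the broker drops any message that has no subscriber at publish time.

```python
from collections import defaultdict


class FanoutBroker:
    """Toy broker (hypothetical): delivers only to subscribers present at publish time."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Nothing is retained: with no subscriber, the message is simply lost.
        for callback in self._subscribers[topic]:
            callback(message)


def run(subscribe_first):
    broker = FanoutBroker()
    received = []
    topic = "tasks.42"  # assumed topic name for a task with task_id 42
    if subscribe_first:
        broker.subscribe(topic, received.append)   # step 2
        broker.publish(topic, "COMPLETED")         # step 3
    else:
        broker.publish(topic, "COMPLETED")         # step 3 happens first
        broker.subscribe(topic, received.append)   # step 2 is too late
    return received


in_order = run(subscribe_first=True)    # A sees the update
reordered = run(subscribe_first=False)  # A sees nothing
```

In the reordered run A's subscription succeeds but never fires, which is why A cannot tell a still-running task from one that already finished.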
The only way to handle this race condition I can think of is to change step 2:
- A subscribes to changes of `tasks.{task_id}`
- A queries for the current status with `GET /tasks/{task_id}`
- For any received notification, A checks that it is a newer version than the state returned by the manual query (and vice versa).
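The version check in the last step could look roughly like the sketch below. It assumes B attaches a monotonically increasing `version` number to every task state, both in the `GET` response and in each notification. That field, along with `TaskState` and `TaskTracker`, is a hypothetical name and not part of the API described above.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskState:
    version: int  # assumption: increases with every status change made by B
    status: str


class TaskTracker:
    """Keeps only the newest state, whichever channel delivered it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = None

    def offer(self, state):
        """Apply `state` if it is newer than what we hold; return True if accepted."""
        with self._lock:
            if self._latest is None or state.version > self._latest.version:
                self._latest = state
                return True
            return False

    @property
    def latest(self):
        with self._lock:
            return self._latest


tracker = TaskTracker()
# A notification arrives before the (slower) GET response does.
accepted_push = tracker.offer(TaskState(version=3, status="COMPLETED"))
# The GET response carries an older snapshot and is ignored.
accepted_poll = tracker.offer(TaskState(version=2, status="RUNNING"))
```

Because both the notification handler and the query result go through the same `offer` call, the order in which they arrive no longer matters.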
Is there another approach or a better practice for this problem?
2 Answers
For terminology: For the task-completed events, B is the producer and A is the consumer.
Create a queue per consumer
When creating the task, identify the creator/consumer. Send the task-completed event to a queue specific to that consumer. This way, the consumer doesn't need to process any messages from the queue that are not relevant to it and you avoid race conditions.
Thinking about it further, this may even be a good idea: allow the task creator/consumer to specify, when the task is created, the queue to which the completion message should be sent.
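A minimal sketch of that idea, using in-process `queue.Queue` objects as a stand-in for broker queues. The `reply_to` parameter, the queue name `service-a.task-events`, and the `ServiceB` class are all hypothetical and only illustrate the shape of the interaction.

```python
import queue


class ServiceB:
    """Stand-in for B: publishes the completion event to the caller-supplied queue."""

    def __init__(self, queues):
        self._queues = queues  # queue name -> queue.Queue (stand-in for the broker)
        self._next_id = 0

    def create_task(self, payload, reply_to):
        self._next_id += 1
        task_id = str(self._next_id)
        # Worst case for the race: the task finishes before create_task returns.
        self._queues[reply_to].put({"task_id": task_id, "status": "COMPLETED"})
        return task_id


# A's queue exists before any task is created, so it buffers early events.
queues = {"service-a.task-events": queue.Queue()}
b = ServiceB(queues)

task_id = b.create_task({"kind": "demo"}, reply_to="service-a.task-events")
event = queues["service-a.task-events"].get_nowait()
```

Since the queue outlives individual tasks, there is no subscription step for a completion event to slip past. A only has to match `event["task_id"]` against the tasks it created.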
- I think it couples the producers and consumers. The producer should not know its subscribers... – Sawel, Jun 8, 2022 at 11:25
- Then have the queue name be a parameter of the create-task request. – marstato, Jun 8, 2022 at 16:37
It sounds like you're overcomplicating this with subscriptions to specific (and short-lived) queues for individual messages. You're effectively building request/response on top of queues: possible, occasionally useful, but rarely required.
I think you should set it up like so:
- Set up permanent topics/subscriptions/queues for "RequestReceived" and "ThingProcessed". A subscribes to "ThingProcessed", B subscribes to "RequestReceived".
- A receives a request
- If the request is valid, A publishes a message to the "RequestReceived" topic/queue and responds to the requester with a tracking ID.
- B receives the event from its subscription. It processes the request, then publishes the outcome to "ThingProcessed".
- A receives B's message via its subscription.
You can't have race conditions here, as there are no temporary queues to receive specific updates. It has the added advantages of decoupling A and B (they just talk to a queue and have no idea who or what is sending or receiving their messages) and of allowing you to subscribe other systems to those events in due course, if required.
If the original requester to A wants an update, they can call A's API and see the status. Alternatively, A could be set up to call the original requester when it receives its update from B (via the message queue).
Requester -> A -> Publish "RequestReceived" -> B -> Publish "ThingProcessed" -> A -> Requester
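The flow above can be sketched with two long-lived in-process queues standing in for the broker. The class and function names, the `PENDING`/`PROCESSED` statuses, and the use of a UUID as the tracking ID are assumptions made for illustration.

```python
import queue
import uuid

# Permanent queues, created once at startup (stand-ins for broker topics).
request_received = queue.Queue()  # A -> B
thing_processed = queue.Queue()   # B -> A


class ServiceA:
    def __init__(self):
        self.status = {}  # tracking_id -> latest known status

    def handle_request(self, payload):
        tracking_id = str(uuid.uuid4())
        # Record the request before publishing, so a fast reply finds its entry.
        self.status[tracking_id] = "PENDING"
        request_received.put({"tracking_id": tracking_id, "payload": payload})
        return tracking_id

    def drain_processed(self):
        """Consume everything currently waiting on the ThingProcessed queue."""
        while not thing_processed.empty():
            event = thing_processed.get()
            self.status[event["tracking_id"]] = event["status"]


def service_b_step():
    """B handles one RequestReceived message and publishes the outcome."""
    message = request_received.get()
    thing_processed.put(
        {"tracking_id": message["tracking_id"], "status": "PROCESSED"}
    )


a = ServiceA()
tracking_id = a.handle_request({"n": 1})
status_before = a.status[tracking_id]
service_b_step()
a.drain_processed()
status_after = a.status[tracking_id]
```

The tracking ID is the only thing that ties a "ThingProcessed" event back to its request, so neither service needs to know which queue or consumer sits on the other side.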