
I wrote a very simple FastAPI/Celery/Redis/Flower program to start understanding their working. Repo is https://github.com/rjalexa/fastapi-redis if you care to see.

The FastAPI route looks for a string in the Redis cache; if it finds it, it returns a hash/dict with code=200.

If it does not, it passes the string to Celery, which triggers a long-running computation (not in this dummy repo, but in real life it could take up to ten seconds), and returns code=202. When processing is done, the results are added to the Redis cache.

What I want to avoid: while a string is queued for processing (which could take many seconds), if I receive another request for the same string, I want to just return code=202 without queueing a new task for that same string.
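A minimal sketch of the flow described above, with a plain dict standing in for the Redis cache and a list standing in for the Celery queue (all names here are hypothetical, not from the repo). It deliberately shows the problem: a second request for a pending key enqueues the task again.

```python
# Hypothetical stand-ins: a dict for the Redis cache, a list for the Celery queue.
cache = {}
queued = []

def lookup(key):
    """Return (status_code, payload): 200 on a cache hit, 202 otherwise."""
    if key in cache:
        return 200, cache[key]
    queued.append(key)  # in the real app: long_task.delay(key)
    return 202, None

def on_task_done(key, result):
    """Worker-side callback: store the computed result in the cache."""
    cache[key] = result
```

Note that two calls to `lookup("a")` before the task finishes leave the key in the queue twice, which is exactly the duplicate-enqueue behaviour the question wants to avoid.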

Thanks for any clarification.

asked Nov 5, 2023 at 16:52
  • Why don't you use a second table for signaling that you're preparing that data? Or already add the entry in the redis database with an empty value? Commented Nov 5, 2023 at 21:19
  • Thank you @Isabi. The second idea would mean I get a cache hit with no data, and in that case I should still return a 202. Did I understand your suggestion correctly? The first idea I need to understand better, since I've never heard the "table" concept applied to Redis. Commented Nov 6, 2023 at 6:24

1 Answer


I'm answering here instead of using the comments, since the answer is long.

Premise

I have never used Redis, so take what I say with a grain of salt. I use the term "table" because I come from a SQL background, so it may be the wrong word here.

Idea 1

The underlying idea here is to divide the data in two: the cache and the preparation/pre-cache. The cache simply holds the finished data, while the pre-cache is a space to park the keys whose values are being prepared. Once prepared, they are moved from the pre-cache to the actual cache. A miss on the cache then triggers a lookup on the pre-cache; on a hit there, you can tell the user that the data is not ready yet.
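A sketch of Idea 1, using a dict and a set as stand-ins for the two Redis keyspaces (the names `cache`, `precache`, and `enqueued` are hypothetical, for illustration only):

```python
cache = {}       # finished data
precache = set() # keys whose values are being prepared
enqueued = []    # records what got queued, to show no duplicates occur

def lookup(key):
    """200 on a cache hit; 202 otherwise, queueing work only once per key."""
    if key in cache:
        return 200, cache[key]
    if key in precache:
        return 202, None        # already being prepared; do not enqueue again
    precache.add(key)
    enqueued.append(key)        # in the real app: long_task.delay(key)
    return 202, None

def on_task_done(key, result):
    """Move the finished value from the pre-cache to the cache."""
    cache[key] = result
    precache.discard(key)
```

With real Redis the pre-cache check and insert should be a single atomic operation (e.g. one `SADD`), otherwise two concurrent requests could both miss the pre-cache and both enqueue.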

Idea 2

This idea combines the two tables of Idea 1 into a single one. If a request's key has an entry in the cache with a non-null value, return the value. On a cache miss, the data has to be computed: first create an (empty) entry in the cache, then start the computation. This way, if a second request arrives for the same data, the program finds the empty value and recognizes that the computation has started but is not yet finished; it can then behave however you decide for that edge case.
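A sketch of Idea 2, with one dict playing the single table and a sentinel marking "started but not finished" (names are hypothetical). In real Redis, claiming the key would be one atomic `SET key "" NX` (redis-py: `r.set(key, "", nx=True)`), which also closes the race between two concurrent misses:

```python
PENDING = object()  # sentinel: computation started, result not ready yet
cache = {}
enqueued = []       # records what got queued, to show no duplicates occur

def lookup(key):
    if key not in cache:
        cache[key] = PENDING    # in Redis: SET key "" NX, atomically
        enqueued.append(key)    # in the real app: long_task.delay(key)
        return 202, None
    value = cache[key]
    if value is PENDING:
        return 202, None        # computation in flight; nothing re-queued
    return 200, value

def on_task_done(key, result):
    """Replace the sentinel with the computed value."""
    cache[key] = result
```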

answered Nov 6, 2023 at 21:25

3 Comments

Thanks a lot, that makes sense. My current Python application-level solution seems much simpler :) If I have a cache miss, I place the key in a Python set which acts as my "queue". The beauty is that inserting an already-present key into the set is simply ignored. Simpler, but of course not as powerful... Take care
But upon further investigation, maybe the right solution is SADD (adding the key to a Redis set) :)
Didn't know it was called SADD but yes, that's more or less the idea that I had in mind
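For the record, `SADD` returns 1 only when the member is new and 0 when it already exists, so it doubles as an atomic "have I already queued this?" check. A sketch with a tiny fake Redis class (hypothetical; the real redis-py client exposes the same `sadd`/`srem` methods):

```python
class FakeRedis:
    """Mimics just the SADD/SREM behaviour needed for this pattern."""
    def __init__(self):
        self._sets = {}

    def sadd(self, name, member):
        s = self._sets.setdefault(name, set())
        if member in s:
            return 0            # already a member, nothing added
        s.add(member)
        return 1                # newly added

    def srem(self, name, member):
        self._sets.get(name, set()).discard(member)

r = FakeRedis()  # in the real app: redis.Redis(...)

def enqueue_once(key):
    """Queue the computation only if this key isn't already pending."""
    if r.sadd("pending", key) == 1:
        # long_task.delay(key) in the real app; the worker would
        # call r.srem("pending", key) when it finishes
        return True
    return False
```

Because `SADD` is a single Redis command, the check-and-add is atomic even across multiple FastAPI workers, which the in-process Python set is not.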
