PYTHON-5504 Prototype exponential backoff in with_transaction #2492
Conversation
Can you add an async version of the benchmark? Having both APIs tested before merging this into the backpressure branch would be ideal.
Done. Added the script to the Jira ticket. The async version still shows a significant reduction in the number of wasted retries and lower p50 to p90 latency, but little to no benefit for p99 and p100 latency.
Before:
$ python3.13 repro-storm-async.py
Completed 200 concurrent async transactions in 3.467694044113159 seconds
Total retry attempts: 5950
avg latency: 1.87s p50: 2.04s p90: 3.22s p99: 3.45s p100: 3.46s
After:
$ python3.13 repro-storm-async.py
Completed 200 concurrent async transactions in 3.5634748935699463 seconds
Total retry attempts: 887
avg latency: 1.48s p50: 1.41s p90: 2.81s p99: 3.49s p100: 3.56s
Async sees significantly less improvement with the backoff, but I'd say that's expected. Asyncio's cooperative multitasking structure already prevents a given operation from retrying before the other concurrent async tasks have had a chance to run (assuming the async/await code is written correctly).
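As a tiny, self-contained illustration of that point (not code from this PR): each await in an asyncio task hands control back to the event loop, so the other pending tasks get to run before the first task can attempt its next retry.

```python
import asyncio


async def worker(name: str, attempts: int) -> None:
    for i in range(attempts):
        print(f"{name}: attempt {i}")
        # Stand-in for the awaited network round trip of a retried operation.
        await asyncio.sleep(0)


async def main() -> None:
    await asyncio.gather(worker("task-1", 3), worker("task-2", 3))


asyncio.run(main())
# The output interleaves task-1 and task-2: every await yields to the event loop,
# so neither task can burn through all of its attempts back-to-back.
```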
Merged cf7a1aa into mongodb:backpressure
PYTHON-5504 Prototype exponential backoff in with_transaction.
Using the repro script in Jira, which runs 200 concurrent transactions in 200 threads all updating the same document, shows a significant reduction in wasted retry attempts and latency (from p50 to p100). Before this change:
After (with a 50ms initial backoff, a 1000ms max backoff, full jitter, and backoff starting on the second retry attempt):
Backoff starting on the first retry attempt appears to work even better:
Note: I'm using free-threaded mode to make this repro more similar to the behavior of other languages and other deployment types (e.g. many single-threaded clients running on different machines).
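For reference, a minimal sketch of the full-jitter backoff described above (50ms initial, 1000ms cap, backoff starting on the second retry attempt). The constant and function names here are illustrative, not the identifiers used in the actual patch, and the real with_transaction retry loop also handles commit retries and the overall transaction time limit:

```python
import random
import time

# Illustrative constants matching the values quoted above (assumed names).
INITIAL_BACKOFF = 0.050  # 50ms
MAX_BACKOFF = 1.000      # 1000ms


def _full_jitter_backoff(retry: int) -> float:
    """Sleep time for the given retry: a uniformly random value between 0 and
    min(MAX_BACKOFF, INITIAL_BACKOFF * 2**retry)."""
    return random.uniform(0.0, min(MAX_BACKOFF, INITIAL_BACKOFF * (2 ** retry)))


def run_with_backoff(callback, session, max_retries=50):
    """Very simplified stand-in for the with_transaction retry loop: retry the
    callback on errors, backing off starting on the second retry attempt."""
    for retry in range(max_retries):
        try:
            return callback(session)
        except Exception:
            # The real code only retries errors labeled TransientTransactionError
            # and is bounded by a time limit rather than a fixed retry count.
            if retry + 1 >= max_retries:
                raise
            if retry >= 1:  # start backing off on the second retry attempt
                time.sleep(_full_jitter_backoff(retry))
```

With full jitter, concurrent retries are spread uniformly over the backoff window instead of waking up in lockstep, which is what cuts the wasted retry attempts in the numbers above.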