Client-side Token Bucket Traffic Shaper to prevent 429 errors on rate-limited APIs (free/trial keys, local models) #1321
-
Hi everyone! 👋
I've been using OpenClaude recently and absolutely love the project. However, like many others using restricted-tier APIs, evaluation accounts, or local endpoints (such as the NVIDIA NIM trial tier limited to 40 RPM, Groq, or OpenRouter free tiers), I ran into a major bottleneck: frequent 429 Too Many Requests cascades.
🔍 The Problem
Currently, OpenClaude has no client-side rate limiting or request scheduling. When multiple subagents or tools run in parallel, they dispatch a sudden burst of HTTP requests. For limited-tier users:
- The external API immediately rejects the burst with a 429 error.
- OpenClaude's retry mechanism kicks in and resends requests rapidly, amplifies the thundering herd problem, and thrashes prompt cache.
- This makes the tool almost unusable for anyone without high-tier, pay-as-you-go enterprise keys.
💡 The Proposal: Client-Side Token Bucket Traffic Shaper
To make OpenClaude accessible to developers on limited/free tiers, I propose adding a client-side Token Bucket Traffic Shaper with a bounded FIFO Request Queue.
Here is how the architecture looks:
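A minimal TypeScript sketch of the idea follows. It is only an illustration under stated assumptions, not the implementation mentioned later in this thread and not an existing OpenClaude API. Every name in it (`TokenBucketShaper`, `ShaperOptions`, `schedule`, `capacity`, `refillPerSecond`, `maxQueue`) is hypothetical. The bucket starts full, refills continuously at a fixed rate, and each request consumes one token. Requests that find the bucket empty wait in a FIFO queue, and requests that find the queue full are rejected immediately instead of piling up.

```typescript
// Hypothetical sketch: none of these names come from OpenClaude.
type Task<T> = () => Promise<T>;

interface ShaperOptions {
  capacity: number;        // maximum burst size, in tokens
  refillPerSecond: number; // sustained rate, e.g. 40 RPM = 40 / 60
  maxQueue: number;        // upper bound on waiting requests
}

class TokenBucketShaper {
  private readonly opts: ShaperOptions;
  private readonly queue: Array<() => void> = [];
  private tokens: number;
  private last: number;
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(opts: ShaperOptions) {
    this.opts = opts;
    this.tokens = opts.capacity; // start with a full bucket
    this.last = Date.now();
  }

  // Credit the tokens accrued since the last refill, capped at capacity.
  private refill(): void {
    const now = Date.now();
    const elapsedSeconds = (now - this.last) / 1000;
    this.tokens = Math.min(
      this.opts.capacity,
      this.tokens + elapsedSeconds * this.opts.refillPerSecond,
    );
    this.last = now;
  }

  // Start queued requests in FIFO order while tokens are available,
  // then arm a single timer for the moment the next token is due.
  private drain(): void {
    this.refill();
    while (this.queue.length > 0 && this.tokens >= 1) {
      this.tokens -= 1;
      const start = this.queue.shift()!;
      start();
    }
    if (this.queue.length > 0 && this.timer === null) {
      const missing = 1 - this.tokens;
      const waitMs = Math.max(
        1,
        Math.ceil((missing / this.opts.refillPerSecond) * 1000),
      );
      this.timer = setTimeout(() => {
        this.timer = null;
        this.drain();
      }, waitMs);
    }
  }

  // Run `task` once a token is available; reject at once if the queue is full.
  schedule<T>(task: Task<T>): Promise<T> {
    if (this.queue.length >= this.opts.maxQueue) {
      return Promise.reject(new Error("shaper queue full"));
    }
    return new Promise<T>((resolve, reject) => {
      this.queue.push(() => {
        task().then(resolve, reject);
      });
      this.drain();
    });
  }
}
```

For the 40 RPM NVIDIA NIM trial tier mentioned above, a caller might construct `new TokenBucketShaper({ capacity: 5, refillPerSecond: 40 / 60, maxQueue: 100 })` and wrap each outgoing HTTP call in `shaper.schedule(() => fetch(url))`. The `capacity` and `maxQueue` values here are arbitrary placeholders that would need tuning per provider.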
Replies: 2 comments
-
I have already fully implemented this solution locally, including: high-quality TypeScript code, a comprehensive test suite featuring 19 test cases...
-
Please report with more details here: https://github.com/Gitlawb/openclaude/issues