My favourite 3rd party library isn't getting maintained so now I need to make my own library for interfacing with Riot Games' API. Epic.
Problem is that there are rules, such as:
- 100 requests in 2 minutes
- 20 requests in 1 second
So I made an API request scheduler.
import asyncio
import time
from collections import deque
from typing import Deque

import httpx


class SingleRegionCoordinator:
    WAIT_LENIENCY = 2

    def __init__(self, subdomain: str) -> None:
        self.subdomain = subdomain
        self.calls: Deque[float] = deque()
        self.base_url = f"https://{subdomain}.api.riotgames.com"
        self.force_wait = 0

    def _schedule_call(self) -> float:
        """
        Schedule an API call. This call must be atomic (call finishes before
        being called by another coroutine).

        Returns:
            float: The time in seconds to wait to make the API call
        """
        now = time.time()
        # Remove all calls older than 2 minutes
        while self.calls and self.calls[0] < now - 120:
            self.calls.popleft()
        # Figure out how long to wait before there will be less than 100 calls
        # in the last 2 minutes worth of requests
        rate_1s_time, rate_2m_time = 0, 0
        if len(self.calls) >= 100:
            rate_2m_time = (
                self.calls[-100] + 120 + SingleRegionCoordinator.WAIT_LENIENCY
            )
        # Figure out how long to wait before there will be less than 20 calls
        # in the last second worth of requests
        if len(self.calls) >= 20:
            rate_1s_time = (
                self.calls[-20] + 1 + SingleRegionCoordinator.WAIT_LENIENCY
            )
        scheduled_time = max(self.force_wait, rate_2m_time, rate_1s_time, now)
        self.calls.append(scheduled_time)
        return scheduled_time - now

    async def _api_call(
        self, method: str, path: str, params: dict = None
    ) -> dict:
        """
        Make an API call

        Args:
            method: The HTTP method to use
            path: The path to the API endpoint
            params: The parameters to pass to the API endpoint

        Returns:
            dict: The API response
        """
        # Schedule the call
        wait_time = self._schedule_call()
        await asyncio.sleep(wait_time)
        url = f"{self.base_url}{path}"
        headers = {"X-Riot-Token": "code edited for codereview"}
        response = await httpx.request(
            method, url, headers=headers, params=params
        )
        res = response.json()
        # Check if we got a rate limit error
        if res["status"]["status_code"] == 429:
            # Let the scheduler know that we are in trouble
            self.force_wait = (
                time.time() + 120 + SingleRegionCoordinator.WAIT_LENIENCY
            )
            return await self._api_call(method, path, params)
        return res
We can see it in action (sort of) with some quick test code to see what it tries to achieve:
import numpy as np

# Not recommended in practice but makes this example clearer
SingleRegionCoordinator.WAIT_LENIENCY = 0
src = SingleRegionCoordinator("")
for _ in range(120):
    src._schedule_call()
arr = np.array(src.calls)
# arr holds the scheduled time of every API call; shift it so that the
# first call is at 0
arr -= arr.min()
print(arr)
Output:
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 2. 2.
2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
2. 2. 2. 2. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3.
3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 4. 4. 4. 4.
4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4.
4. 4. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120.
120. 120. 120. 120. 120. 120. 120. 120.]
Works as intended. It greedily schedules API calls as soon as possible, but respects the rules.
I know that:
- This solution won't horizontally scale.
- If the application is closed and reopened, it's going to run into a little trouble (though it can resolve by itself).
- Exponential backoff is the typical go-to solution for error handling with APIs. However, I find it to be no good with rate limit scenarios, as batching, say, 1000 requests would eventually lead you to be waiting way more time than you need. This is supposed to be the optimal scheduler.
- There are additional techniques such as caching requests.
My main questions are:
- What are the general best practices when it comes to dealing with API rate limiting? This solution works in my mind but I have little idea how it's actually going to behave in the wild.
- Is the code style fine? I now use the Black autoformatter with line length 79 plus import ordering (builtin, external, internal, alphabetical). Is a line length of 79 a thing of the past yet, or is that not important? I like short line lengths because it means I can put multiple scripts side-by-side easily.
- Is -> dict OK? I really like TypeScript's interface, where I could do something like:
interface SomethingDTO {
    id: string
    data: Array<SomethingElseDTO>
}
Instead now I have to write classes and... you know. A lot of work and honestly, hardly worth the effort either. Or am I delusional? Even this suggested alternative is not that good since it doesn't deal with nested objects. Some API responses are also just a straight PITA to typehint.
1 Answer
Don't use time.time; use time.monotonic. Otherwise, certain OS time changes are going to deliver a nasty surprise.
Make a constant for your 120 seconds.
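As a rough sketch of both points together, the pruning step could look something like the following. The constant names and the prune_old_calls helper are my own invention for illustration, not anything from your code; the values are the limits you quoted.

```python
import time
from collections import deque
from typing import Deque

# Hypothetical constant names; the values are the limits from the question.
LONG_WINDOW_SECONDS = 120
LONG_WINDOW_CALLS = 100
SHORT_WINDOW_SECONDS = 1
SHORT_WINDOW_CALLS = 20


def prune_old_calls(calls: Deque[float], now: float) -> None:
    """Drop timestamps that have left the long rate-limit window."""
    while calls and calls[0] < now - LONG_WINDOW_SECONDS:
        calls.popleft()


# time.monotonic() never goes backwards, so the difference between two
# readings is a safe duration even if the wall clock is adjusted.
now = time.monotonic()
calls: Deque[float] = deque([now - 300.0, now - 121.0, now - 5.0, now])
prune_old_calls(calls, now)
print(len(calls))  # the two entries inside the 120-second window remain
```

Monotonic readings only mean something relative to each other, so every timestamp the scheduler stores (including force_wait) would need to come from the same clock.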
You ask:
Is -> dict OK?
Not really. This:
params: dict = None
is first of all incorrect since it would need to be Optional[dict], which is basically Optional[Dict[Any, Any]]. Setting aside the outer Optional, in decreasing order of type strength, your options are roughly:
- TypedDict
- Dict[str, str] if all of your values are strings but you don't enforce key names
- Dict[str, Union[...]] if you know of a value type set and don't enforce key names
- Dict[str, Any] if you have no idea what the values are
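For the strongest of those options, here is a minimal sketch of a TypedDict that mirrors the TypeScript interface from your question. The field names (id, data, name) and the parse_something helper are placeholders of mine, not real Riot API fields.

```python
from typing import List, TypedDict


# Hypothetical shapes mirroring the TypeScript interface in the question;
# the field names are placeholders, not real Riot API fields.
class SomethingElseDTO(TypedDict):
    name: str


class SomethingDTO(TypedDict):
    id: str
    data: List[SomethingElseDTO]


def parse_something(raw: dict) -> SomethingDTO:
    # TypedDict is a static-checking aid only: at runtime this is a plain
    # dict, and nothing here validates the contents of `raw`.
    return SomethingDTO(id=raw["id"], data=raw["data"])


result = parse_something({"id": "abc", "data": [{"name": "x"}]})
print(result["id"], type(result) is dict)
```

A static checker such as mypy can then flag a misspelled key like result["idd"], while the running program still sees an ordinary dict.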
That's all assuming that you're stuck using a dictionary. Keep in mind that all of the above hinting is hinting only, and is not enforced at runtime. If you want meaningful runtime enforcement of the structure (which, really, you should) then move to @dataclasses. A plain dataclass rejects missing or unexpected fields when it is constructed, though it does not check the annotated value types; that takes explicit checks or a validation library. They're really not a PITA; they're basically the lightest-weight class definition mechanism and have a convenience asdict which makes API integration a breeze. This is effort worth investing if you at all care about program correctness.
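A small sketch of what that could look like, including the nested case you were worried about. The DTO names follow your TypeScript example; the field names and the from_json helper are placeholders I made up, not part of any real API.

```python
from dataclasses import asdict, dataclass
from typing import List


# Hypothetical DTOs; field names are placeholders, not real Riot API fields.
@dataclass
class SomethingElseDTO:
    name: str


@dataclass
class SomethingDTO:
    id: str
    data: List[SomethingElseDTO]

    @classmethod
    def from_json(cls, raw: dict) -> "SomethingDTO":
        # Nested objects have to be built explicitly; a missing or unexpected
        # key raises KeyError/TypeError here instead of surfacing much later.
        return cls(
            id=raw["id"],
            data=[SomethingElseDTO(**item) for item in raw["data"]],
        )


dto = SomethingDTO.from_json({"id": "abc", "data": [{"name": "x"}]})
print(dto.data[0].name)
print(asdict(dto))  # asdict recurses into nested dataclasses and lists
```

The explicit from_json step is the price for nesting: the standard library does not build nested dataclasses from a dict on its own, so each level needs its own constructor call.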