API call scheduler for Python

Question 1

My favourite 3rd party library isn't getting maintained so now I need to make my own library for interfacing with Riot Games' API. Epic.

Problem is that there are rules, such as:

100 requests in 2 minutes
20 requests in 1 second

So I made an API request scheduler.

import asyncio
import time
from collections import deque
from typing import Deque

import httpx


class SingleRegionCoordinator:
    WAIT_LENIENCY = 2

    def __init__(self, subdomain: str) -> None:
        self.subdomain = subdomain
        self.calls: Deque[float] = deque()
        self.base_url = f"https://{subdomain}.api.riotgames.com"
        self.force_wait = 0

    def _schedule_call(self) -> float:
        """
        Schedule an API call. This call must be atomic (call finishes before
        being called by another coroutine).

        Returns:
            float: The time in seconds to wait to make the API call
        """
        now = time.time()
        # Remove all calls older than 2 minutes
        while self.calls and self.calls[0] < now - 120:
            self.calls.popleft()
        # Figure out how long to wait before there will be fewer than 100
        # calls in the last 2 minutes worth of requests
        rate_1s_time, rate_2m_time = 0, 0
        if len(self.calls) >= 100:
            rate_2m_time = (
                self.calls[-100] + 120 + SingleRegionCoordinator.WAIT_LENIENCY
            )
        # Figure out how long to wait before there will be fewer than 20
        # calls in the last second worth of requests
        if len(self.calls) >= 20:
            rate_1s_time = (
                self.calls[-20] + 1 + SingleRegionCoordinator.WAIT_LENIENCY
            )
        scheduled_time = max(self.force_wait, rate_2m_time, rate_1s_time, now)
        self.calls.append(scheduled_time)
        return scheduled_time - now

    async def _api_call(
        self, method: str, path: str, params: dict = None
    ) -> dict:
        """
        Make an API call

        Args:
            method: The HTTP method to use
            path: The path to the API endpoint
            params: The parameters to pass to the API endpoint

        Returns:
            dict: The API response
        """
        # Schedule the call
        wait_time = self._schedule_call()
        await asyncio.sleep(wait_time)
        url = f"{self.base_url}{path}"
        headers = {"X-Riot-Token": "code edited for codereview"}
        # httpx.request is synchronous and cannot be awaited, so the request
        # goes through an AsyncClient instead
        async with httpx.AsyncClient() as client:
            response = await client.request(
                method, url, headers=headers, params=params
            )
        res = response.json()
        # Check if we got a rate limit error. The HTTP status code is used
        # because successful response bodies need not contain a "status" key.
        if response.status_code == 429:
            # Let the scheduler know that we are in trouble
            self.force_wait = (
                time.time() + 120 + SingleRegionCoordinator.WAIT_LENIENCY
            )
            return await self._api_call(method, path, params)
        return res

Some quick test code shows it in action (sort of) and what it tries to achieve:

import numpy as np

# Not recommended in practice but makes this example clearer
SingleRegionCoordinator.WAIT_LENIENCY = 0
src = SingleRegionCoordinator("")
for _ in range(120):
    src._schedule_call()
arr = np.array(src.calls)
# arr now holds the scheduled time of every API call, relative to the first
arr -= arr.min()
print(arr)

Output:

[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 2. 2.
 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
 2. 2. 2. 2. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3.
 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 4. 4. 4. 4.
 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4.
 4. 4. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120.
 120. 120. 120. 120. 120. 120. 120. 120.]

Works as intended. It greedily schedules API calls as soon as possible, but respects the rules.
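One way to check "respects the rules" mechanically rather than by eye is a small sliding-window check over the scheduled times. This is only a sketch: `respects_limit` is a hypothetical helper, not part of the code above, and it assumes that a call made exactly one window length after another counts as outside that window, which is the same assumption `_schedule_call` makes.

```python
from typing import Sequence


def respects_limit(
    times: Sequence[float], limit: int, window: float
) -> bool:
    """Return True if no more than `limit` calls fall in any window.

    `times` must be sorted ascending. Call i and call i - limit must be at
    least `window` seconds apart, otherwise limit + 1 calls share a window.
    """
    return all(
        times[i] - times[i - limit] >= window
        for i in range(limit, len(times))
    )


# The schedule printed above: 20 calls at each of t = 0..4, then 20 at t = 120
schedule = [float(t) for t in (0, 1, 2, 3, 4, 120) for _ in range(20)]

print(respects_limit(schedule, limit=20, window=1))     # 20 requests / 1 s
print(respects_limit(schedule, limit=100, window=120))  # 100 requests / 2 min
```

Both checks print `True` for the schedule shown in the output.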

I know that:

This solution won't horizontally scale.
If the application is closed and reopened, it's going to run into a little trouble (though it can recover by itself).
Exponential backoff is the typical go-to solution for error handling with APIs. However, I find it to be no good with rate limit scenarios, as batching, say, 1000 requests would eventually lead you to be waiting way more time than you need. This is supposed to be the optimal scheduler.
There are additional techniques such as caching requests.
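To put a number on the batching point, here is a rough simulation of the greedy schedule for 1000 requests that are all queued at the same instant. It is a sketch under simplifying assumptions: the clock is frozen at t = 0, there is no leniency and no 429 handling, and `greedy_schedule` and `LIMITS` are illustrative names rather than part of the class above.

```python
from typing import List, Sequence, Tuple

# (max requests, window in seconds), as listed at the top of the question
LIMITS: Tuple[Tuple[int, float], ...] = ((100, 120.0), (20, 1.0))


def greedy_schedule(
    n: int, limits: Sequence[Tuple[int, float]] = LIMITS
) -> List[float]:
    """Return the earliest allowed time for each of n calls queued at t=0."""
    times: List[float] = []
    for _ in range(n):
        earliest = 0.0
        for limit, window in limits:
            if len(times) >= limit:
                # The call `limit` places back must have left the window
                earliest = max(earliest, times[-limit] + window)
        times.append(earliest)
    return times


times = greedy_schedule(1000)
print(times[-1])  # 1084.0
```

Under these assumptions the last of the 1000 requests goes out about 18 minutes after the first, which is the baseline any retry-based strategy would have to be compared against.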

My main questions are:

What are the general best practices when it comes to dealing with API rate limiting? This solution works in my mind but I have little idea how it's actually going to behave in the wild.
Is the code style fine? I now use the Black autoformatter with line length 79 plus import ordering (builtin, external, internal, alphabetical). Is a line length of 79 a thing of the past yet, or is that not important? I like short line lengths because it means I can put multiple scripts side-by-side easily.
Is -> dict OK? I really like TypeScript's interface, where I could do something like:

interface SomethingDTO {
 id: string
 data: Array<SomethingElseDTO>
}

Instead now I have to write classes and... you know. A lot of work and honestly, hardly worth the effort either. Or am I delusional? Even this suggested alternative is not that good since it doesn't deal with nested objects. Some API responses are also just a straight PITA to typehint.

Question 2

Don't use time.time; use time.monotonic - otherwise, certain OS time changes are going to deliver a nasty surprise.

Make a constant for your 120 seconds.
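As a sketch of these two suggestions together, the scheduling step might look something like the following. The constant names and the standalone `schedule_call` function are my own choices for illustration, and `WAIT_LENIENCY` and `force_wait` are left out to keep it short. The `clock` parameter is an added assumption that makes the function testable without waiting.

```python
import time
from collections import deque
from typing import Callable, Deque

# Named constants instead of bare 120 / 100 / 1 / 20
LONG_WINDOW_SECONDS = 120.0
LONG_WINDOW_LIMIT = 100
SHORT_WINDOW_SECONDS = 1.0
SHORT_WINDOW_LIMIT = 20


def schedule_call(
    calls: Deque[float], clock: Callable[[], float] = time.monotonic
) -> float:
    """Record the next call in `calls` and return how long to wait for it."""
    now = clock()
    # Forget calls that have already left the long window
    while calls and calls[0] < now - LONG_WINDOW_SECONDS:
        calls.popleft()
    scheduled = now
    if len(calls) >= LONG_WINDOW_LIMIT:
        scheduled = max(
            scheduled, calls[-LONG_WINDOW_LIMIT] + LONG_WINDOW_SECONDS
        )
    if len(calls) >= SHORT_WINDOW_LIMIT:
        scheduled = max(
            scheduled, calls[-SHORT_WINDOW_LIMIT] + SHORT_WINDOW_SECONDS
        )
    calls.append(scheduled)
    return scheduled - now


history: Deque[float] = deque()
print(schedule_call(history))  # 0.0 for the first call
```

Because `time.monotonic` never goes backwards, an NTP correction or a manual clock change cannot make old entries in `calls` look newer or older than they are.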

You ask:

Is -> dict OK?

Not really. This:

params: dict = None

is first of all incorrect since it would need to be Optional[dict], which is basically Optional[Dict[Any, Any]]. Setting aside the outer Optional, in decreasing order of type strength, your options are roughly:

TypedDict
Dict[str, str] if all of your values are strings but you don't enforce key names
Dict[str, Union[...]] if you know of a value type set and don't enforce key names
Dict[str, Any] if you have no idea what the values are
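For the strongest option in that list, a rough `TypedDict` equivalent of the TypeScript interface from the question could look like this. The `name` field and the `normalise_params` helper are made up for illustration; only `SomethingDTO`, `SomethingElseDTO`, `id` and `data` come from the question. `TypedDict` is available from `typing` in Python 3.8 and later.

```python
from typing import Dict, List, Optional, TypedDict


class SomethingElseDTO(TypedDict):
    name: str  # hypothetical field, for illustration only


class SomethingDTO(TypedDict):
    id: str
    # Nesting is expressed by referring to another TypedDict
    data: List[SomethingElseDTO]


def normalise_params(
    params: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Hypothetical helper showing the Optional[...] form of `params`."""
    return {} if params is None else dict(params)


# At runtime this is an ordinary dict; the types only guide a type checker
response: SomethingDTO = {"id": "abc", "data": [{"name": "first"}]}
print(response["data"][0]["name"])
print(normalise_params())
```

A static checker such as mypy would flag a misspelled key or a wrongly typed value here, but nothing is verified when the program runs.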

That's all assuming that you're stuck using a dictionary. Keep in mind that all of the above hinting is hinting only, and is not enforced at runtime. If you want meaningful structure at runtime (which, really, you should) then move to a @dataclass. A dataclass does not validate field types on its own, but it does guarantee that the expected fields exist, and checks can be added in __post_init__. They're really not a PITA; they're basically the lightest-weight class definition mechanism and have a convenience asdict function which makes API integration a breeze. This is effort worth investing if you care at all about program correctness.
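A minimal sketch of the dataclass route, including the nested-object case raised in the question, might look like this. The `name` field and the `from_json` constructor are hypothetical. Dataclasses do not check field types by themselves, so the `__post_init__` check is one assumed way of adding validation, not something the decorator provides.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class SomethingElseDTO:
    name: str  # hypothetical field, for illustration only

    def __post_init__(self) -> None:
        # Explicit runtime check; @dataclass alone would accept any value
        if not isinstance(self.name, str):
            raise TypeError(f"name must be str, got {type(self.name)}")


@dataclass(frozen=True)
class SomethingDTO:
    id: str
    data: List[SomethingElseDTO]

    @classmethod
    def from_json(cls, raw: Dict[str, Any]) -> "SomethingDTO":
        """Build the nested objects from a decoded JSON response."""
        return cls(
            id=raw["id"],
            data=[SomethingElseDTO(**item) for item in raw["data"]],
        )


dto = SomethingDTO.from_json({"id": "abc", "data": [{"name": "first"}]})
print(dto.data[0].name)
print(asdict(dto))  # {'id': 'abc', 'data': [{'name': 'first'}]}
```

A missing or unexpected key in the response fails loudly at construction time instead of surfacing later as a `KeyError` deep in the program, and `asdict` converts the nested structure back to plain dictionaries.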

Reinderien 71.2k5 gold badges76 silver badges257 bronze badges · Accepted Answer · 2021-12-05 00:13:47Z
