RateLimiter¶
The rate limiter pattern controls the rate of calls to a function using a token bucket algorithm. Unlike the bulkhead (which limits concurrent calls), the rate limiter limits calls per time period.
Concepts¶
The rate limiter uses a token bucket algorithm:
- The bucket holds a maximum number of tokens (equal to max_calls)
- Each call consumes one token
- Tokens are refilled continuously at a rate of max_calls / period
- If no tokens are available, the call is either rejected or waits
Token Bucket (max_calls=5, period=1.0)
Time 0.0s: [*] [*] [*] [*] [*] (5 tokens)
Call 1: [*] [*] [*] [*] [ ] (4 tokens)
Call 2: [*] [*] [*] [ ] [ ] (3 tokens)
Call 3: [*] [*] [ ] [ ] [ ] (2 tokens)
Call 4: [*] [ ] [ ] [ ] [ ] (1 token)
Call 5: [ ] [ ] [ ] [ ] [ ] (0 tokens)
Call 6: REJECTED! (or waits for refill)
Time 0.2s: [*] [ ] [ ] [ ] [ ] (1 token refilled)
Call 6: [ ] [ ] [ ] [ ] [ ] (succeeds now)
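The diagram above can be reproduced with a few lines of Python. This is a minimal sketch of a token bucket, not pyresilience's actual implementation: the `TokenBucket` class, its `try_acquire` method, and the injectable `clock` argument are illustrative names chosen here so the example runs deterministically.

```python
import time


class TokenBucket:
    """Minimal token bucket; an illustrative sketch, not pyresilience's implementation."""

    def __init__(self, max_calls: int, period: float, clock=time.monotonic):
        self.capacity = float(max_calls)   # bucket size
        self.rate = max_calls / period     # tokens refilled per second
        self.tokens = float(max_calls)     # start with a full bucket
        self.clock = clock                 # injectable so the example is deterministic
        self.last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self) -> bool:
        """Consume one token if available; return False instead of waiting."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Replay the diagram with a fake clock (max_calls=5, period=1.0).
now = [0.0]
bucket = TokenBucket(max_calls=5, period=1.0, clock=lambda: now[0])
results = [bucket.try_acquire() for _ in range(6)]  # calls 1-5 pass, call 6 is rejected
now[0] = 0.2                                        # 0.2 s at 5 tokens/s refills 1 token
retry = bucket.try_acquire()                        # call 6 succeeds after the refill
```

Because refill is computed lazily from elapsed time, no background thread or timer is needed; the bucket only does work when a call arrives.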
Rate Limiter vs Bulkhead¶
| | Rate Limiter | Bulkhead |
|---|---|---|
| Limits | Calls per time window | Concurrent calls |
| Use case | API rate limits, throttling | Connection pool protection |
| Example | 100 requests/minute | 10 concurrent requests |
| Algorithm | Token bucket | Semaphore |
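To make the "Algorithm" row concrete, the sketch below shows the semaphore side of the comparison using only the standard library. It is not pyresilience's bulkhead; the variable names are illustrative. The point is that a semaphore slot frees up when a call finishes, whereas a token bucket frees capacity only as time passes.

```python
import threading

# Bulkhead-style limit: at most 2 calls in flight at once, regardless of time.
bulkhead = threading.BoundedSemaphore(2)

first = bulkhead.acquire(blocking=False)    # slot 1 taken
second = bulkhead.acquire(blocking=False)   # slot 2 taken
third = bulkhead.acquire(blocking=False)    # refused: both slots are busy

bulkhead.release()                          # one in-flight call finishes
fourth = bulkhead.acquire(blocking=False)   # a slot is free again, no clock involved
```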
Configuration¶
from pyresilience import RateLimiterConfig

config = RateLimiterConfig(
    max_calls=10,   # 10 calls per period
    period=1.0,     # Per 1 second
    max_wait=0.0,   # Reject immediately if no tokens (0 = no waiting)
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_calls | int | 10 | Maximum number of calls allowed per period |
| period | float | 1.0 | Time period in seconds |
| max_wait | float | 0.0 | Maximum seconds to wait for a token. 0 means reject immediately. |
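How the three parameters interact can be worked out with simple arithmetic. The sketch below assumes that a call waits only if a token will become available within max_wait and is rejected otherwise; the library's exact behavior may differ. The helpers `seconds_until_token` and `decide` are hypothetical and not part of pyresilience.

```python
def seconds_until_token(tokens: float, max_calls: int, period: float) -> float:
    """Time until the bucket holds one whole token at the continuous refill rate."""
    rate = max_calls / period               # tokens per second
    return max(0.0, (1.0 - tokens) / rate)


def decide(tokens: float, max_calls: int, period: float, max_wait: float) -> str:
    """Hypothetical helper: classify a call as 'proceed', 'wait', or 'reject'."""
    wait = seconds_until_token(tokens, max_calls, period)
    if wait == 0.0:
        return "proceed"                    # a token is available right now
    return "wait" if wait <= max_wait else "reject"


# With max_calls=10 and period=1.0, an empty bucket needs about 0.1 s for the next token.
empty_no_wait = decide(0.0, 10, 1.0, max_wait=0.0)    # max_wait=0 cannot cover the gap
empty_with_wait = decide(0.0, 10, 1.0, max_wait=0.5)  # 0.5 s is enough to cover it
has_tokens = decide(3.0, 10, 1.0, max_wait=0.0)       # tokens on hand, no wait needed
```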
Usage¶
Basic Rate Limiting¶
import requests
from pyresilience import resilient, RateLimiterConfig

# Allow 10 requests per second
@resilient(rate_limiter=RateLimiterConfig(max_calls=10, period=1.0))
def call_api(endpoint: str) -> dict:
    return requests.get(endpoint).json()
With Waiting¶
Allow callers to wait for a token instead of immediate rejection:
# 100 calls per minute, wait up to 5 seconds for a token
@resilient(rate_limiter=RateLimiterConfig(
    max_calls=100,
    period=60.0,
    max_wait=5.0,
))
def call_api() -> dict:
    return requests.get("https://api.example.com").json()
Async Rate Limiting¶
import aiohttp

@resilient(rate_limiter=RateLimiterConfig(max_calls=50, period=1.0))
async def async_call() -> dict:
    async with aiohttp.ClientSession() as session:
        async with session.get("https://api.example.com") as resp:
            return await resp.json()
With Fallback¶
Return a fallback response (for example, a "rate limited, retry later" payload) instead of raising an error:
from pyresilience import (
    resilient, RateLimiterConfig, FallbackConfig, RateLimitExceededError
)

@resilient(
    rate_limiter=RateLimiterConfig(max_calls=10, period=1.0),
    fallback=FallbackConfig(
        handler=lambda e: {"status": "rate_limited", "retry_after": 1},
        fallback_on=(RateLimitExceededError,),
    ),
)
def call_api() -> dict:
    return requests.get("https://api.example.com").json()
Combined with Circuit Breaker¶
from pyresilience import (
    resilient, RateLimiterConfig, CircuitBreakerConfig, RetryConfig
)

@resilient(
    rate_limiter=RateLimiterConfig(max_calls=100, period=60.0),
    circuit_breaker=CircuitBreakerConfig(failure_threshold=5),
    retry=RetryConfig(max_attempts=3),
)
def robust_call() -> dict:
    return requests.get("https://api.example.com").json()
Events¶
| Event | When |
|---|---|
| EventType.RATE_LIMITED | A call was rejected because the rate limit was exceeded |
Exception¶
import time
from pyresilience import RateLimitExceededError

try:
    result = my_function()
except RateLimitExceededError:
    # Rate limit exceeded, try again later
    time.sleep(1)
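The catch-and-sleep pattern above handles a single rejection. One way to extend it into a bounded retry loop is sketched below. To keep the example runnable on its own, it defines a stand-in `RateLimitExceededError` rather than importing the real one, and `call_with_backoff` is a hypothetical helper, not a pyresilience function.

```python
import time


class RateLimitExceededError(Exception):
    """Stand-in for pyresilience.RateLimitExceededError so the sketch runs on its own."""


def call_with_backoff(func, attempts=3, delay=1.0, sleep=time.sleep):
    """Hypothetical helper: retry func when it is rate limited, sleeping between tries."""
    for attempt in range(attempts):
        try:
            return func()
        except RateLimitExceededError:
            if attempt == attempts - 1:
                raise          # out of attempts, surface the error to the caller
            sleep(delay)       # give the bucket time to refill before retrying


# Usage: a function that is rate limited twice and then succeeds.
calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RateLimitExceededError()
    return "ok"

slept = []                                             # record sleeps instead of blocking
outcome = call_with_backoff(flaky, sleep=slept.append)
```

In real code, replace the stand-in class with the import from pyresilience and leave `sleep` at its default.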
Direct Usage¶
Use the rate limiter without the decorator: