Rate Limits and Retries
Per-key rate limits (RPM)
Rate limits are enforced per API key, measured as requests per minute (RPM).
| Plan | Default RPM |
|---|---|
| Free | 30 |
| Starter | 100 |
| Pro | 500 |
| Enterprise | Unlimited (by contract) |
Notes:
- A key's RPM can be set below the plan maximum at creation time via the Rate Limit field in the console.
- Exceeding the limit returns 429 RATE_LIMIT_EXCEEDED.
- The default hard cap per key is 100 RPM. Higher limits require an Enterprise contract.
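One way to stay under a per-key RPM limit is to pace requests on the client. The sketch below spaces request starts evenly at 60/RPM seconds. `RpmPacer` and its parameters are hypothetical names, not part of any SDK, and the sketch assumes a single thread. Even spacing is a conservative guess at how the server counts requests, since the exact windowing is not documented here.

```python
import time


class RpmPacer:
    """Spaces calls evenly so that at most `rpm` calls start per minute."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / rpm  # minimum seconds between request starts
        self.clock = clock          # injectable for testing
        self.sleep = sleep          # injectable for testing
        self.next_slot = None       # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.next_slot
        self.next_slot = now + self.interval
```

For a Starter key, `pacer = RpmPacer(100)` followed by `pacer.wait()` before each request would leave at least 0.6 seconds between request starts.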
Recommended retry strategy
For 429 and 5xx responses, retry with exponential backoff and jitter.
Safe defaults:
- Retry budget: 3 attempts total.
- Initial delay: 1 second.
- Backoff multiplier: 2×.
- Jitter: ±20%.
- Do not retry 400, 401, or 402; they will never succeed without a code change.
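```python
import random
import time

import requests

MAX_ATTEMPTS = 3  # retry budget: attempts in total


def request_with_retry(method, url, **kwargs):
    """Send a request, retrying 429 and 5xx with exponential backoff and jitter."""
    kwargs.setdefault("timeout", 30)  # let callers override the default timeout
    delay = 1.0  # initial delay in seconds
    for attempt in range(MAX_ATTEMPTS):
        r = requests.request(method, url, **kwargs)
        if r.status_code < 400:
            return r
        retryable = r.status_code == 429 or r.status_code >= 500
        # Other 4xx errors (400, 401, 402, ...) and the final attempt raise immediately.
        if not retryable or attempt == MAX_ATTEMPTS - 1:
            r.raise_for_status()
        # Wait for the current delay with ±20% jitter, then double it.
        time.sleep(delay * (1 + random.uniform(-0.2, 0.2)))
        delay *= 2
```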
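To see what the safe defaults mean in wall-clock terms, the sleep window before each retry can be computed directly. This is an illustrative helper (`backoff_bounds` is a hypothetical name) that assumes the defaults listed above: 3 attempts, 1 second initial delay, 2× multiplier, ±20% jitter.

```python
def backoff_bounds(attempts=3, initial=1.0, multiplier=2.0, jitter=0.2):
    """Return a list of (min, max) sleep times in seconds, one per retry."""
    bounds = []
    delay = initial
    for _ in range(attempts - 1):  # there is no sleep after the final attempt
        bounds.append((delay * (1 - jitter), delay * (1 + jitter)))
        delay *= multiplier
    return bounds
```

With the defaults, the first retry waits roughly 0.8 to 1.2 seconds and the second roughly 1.6 to 2.4 seconds, so backoff adds at most about 3.6 seconds on top of the time spent in the requests themselves.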
Concurrency
- Concurrent requests from the same key count against the same RPM budget.
- There is no documented maximum concurrent connection count. In practice, keep concurrency below RPM/60 to avoid bursty 429s (e.g., ≤ 8 concurrent for Pro at 500 RPM).
- Split workloads across multiple keys only when your plan allows multiple keys (Starter ≥ 3, Pro / Enterprise unlimited).
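The RPM/60 guideline above can be applied with a bounded thread pool. This is a minimal sketch: `max_concurrency` and `run_bounded` are hypothetical helpers, and rounding up to one worker for plans below 60 RPM is an assumption, since the guideline itself gives a value below 1 there. Capping concurrency only smooths bursts; it does not enforce the RPM limit on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def max_concurrency(rpm):
    """Whole number of workers at or below RPM/60, with an assumed floor of 1."""
    return max(1, rpm // 60)


def run_bounded(func, items, rpm):
    """Apply `func` to each item using at most max_concurrency(rpm) threads."""
    with ThreadPoolExecutor(max_workers=max_concurrency(rpm)) as pool:
        return list(pool.map(func, items))  # results keep the input order
```

For a Pro key at 500 RPM, `max_concurrency(500)` gives 8, matching the example in the list above.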
Idempotency
Requests are idempotent in the semantic sense — the same audio returns the same analysis — but each call is billed. If you retry after a network timeout, you may be charged twice. Deduplicate on the client if this matters.
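One possible form of client-side deduplication is to key results by a hash of the audio bytes, so audio that has already been analyzed is never sent again. This is a sketch under assumptions: `AnalysisCache` is a hypothetical name, and `analyze` stands in for whatever function your client uses to call the API. It does not help when a response was lost to a timeout, because in that case there is no result to cache.

```python
import hashlib


class AnalysisCache:
    """Remembers results by SHA-256 of the audio bytes to avoid repeat calls."""

    def __init__(self, analyze):
        self.analyze = analyze  # hypothetical callable: bytes -> analysis result
        self.results = {}       # hex digest -> cached analysis

    def get(self, audio_bytes):
        """Return the cached analysis, calling `analyze` only on a cache miss."""
        key = hashlib.sha256(audio_bytes).hexdigest()
        if key not in self.results:
            self.results[key] = self.analyze(audio_bytes)
        return self.results[key]
```

The in-memory dictionary is lost when the process exits; a persistent store would be needed to deduplicate across runs.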