Rate Limits and Retries

Per-key rate limits (RPM)

Rate limits are enforced per API key, measured as requests per minute (RPM).

Plan	Default RPM
Free	30
Starter	100
Pro	500
Enterprise	Unlimited (by contract)

Notes:

A key's RPM can be set below the plan maximum at creation time via the Rate Limit field in the console.
Exceeding the limit returns 429 RATE_LIMIT_EXCEEDED.
By default, a key cannot exceed the RPM listed for its plan in the table above. Higher limits require an Enterprise contract.
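One way to stay under a per-key limit is to pace requests on the client so that 429s are rare to begin with. The sketch below is illustrative only: `RpmThrottle` is a hypothetical helper, not part of any official SDK, and it assumes a single-threaded caller. It spaces calls evenly at 60/RPM seconds apart.

```python
import time


class RpmThrottle:
    """Hypothetical helper: spaces calls evenly so one key stays at or under a target RPM."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.interval = 60.0 / rpm   # minimum seconds between request starts
        self._clock = clock          # injectable for testing
        self._sleep = sleep          # injectable for testing
        self._next_allowed = None    # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

Call `throttle.wait()` immediately before each request. For the Free plan (30 RPM) this allows one request every 2 seconds. The class is not thread-safe; guard `wait()` with a lock if several threads share one key.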

Recommended retry strategy

For 429 and 5xx responses, retry with exponential backoff and jitter.

Safe defaults:

Retry budget: 3 attempts total.
Initial delay: 1 second.
Backoff multiplier: 2×.
Jitter: ±20%.
Do not retry 400, 401, or 402: repeating the same request will not succeed until the underlying problem is fixed.

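import random
import time

import requests

def request_with_retry(method, url, **kwargs):
    """Send a request, retrying 429 and 5xx responses (3 attempts total)."""
    kwargs.setdefault("timeout", 30)  # avoids a clash if the caller passes timeout
    delay = 1.0  # initial delay in seconds
    for attempt in range(3):
        r = requests.request(method, url, **kwargs)
        if r.status_code < 400:
            return r
        # Only 429 and 5xx are retryable; 400, 401, 402 and other 4xx fail immediately.
        retryable = r.status_code == 429 or r.status_code >= 500
        if not retryable or attempt == 2:
            r.raise_for_status()
        # Sleep for the current delay with ±20% jitter, then double it.
        time.sleep(delay * (1 + random.uniform(-0.2, 0.2)))
        delay *= 2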
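To see how long the safe defaults can make a caller wait, the sleep before each retry can be computed directly. This is a small sketch under the defaults listed above; `backoff_bounds` is a hypothetical helper name introduced here for illustration.

```python
def backoff_bounds(attempts=3, initial=1.0, multiplier=2.0, jitter=0.2):
    """Return a list of (min, max) sleep times in seconds, one per retry."""
    bounds = []
    delay = initial
    for _ in range(attempts - 1):  # no sleep follows the final attempt
        bounds.append((delay * (1 - jitter), delay * (1 + jitter)))
        delay *= multiplier
    return bounds
```

With the defaults there are two sleeps, roughly 0.8 to 1.2 seconds and then 1.6 to 2.4 seconds, so backoff adds at most about 3.6 seconds on top of the time spent in the three requests themselves.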

Concurrency

Concurrent requests from the same key count against the same RPM budget.
There is no documented maximum number of concurrent connections. In practice, keep concurrency below RPM/60 (the sustained per-second request rate) to avoid bursty 429s (e.g., ≤ 8 concurrent for Pro at 500 RPM).
Split workloads across multiple keys only when your plan allows multiple keys (Starter ≥ 3, Pro / Enterprise unlimited).
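The RPM/60 rule of thumb can be applied by sizing a worker pool from the key's RPM. The following is a minimal sketch, assuming a thread-based client; `max_concurrency` and `run_bounded` are hypothetical names, and the floor of 1 for plans under 60 RPM is a choice made here, not documented behavior.

```python
from concurrent.futures import ThreadPoolExecutor


def max_concurrency(rpm):
    """Whole number of in-flight requests at or below RPM/60, never less than 1."""
    return max(1, rpm // 60)


def run_bounded(func, items, rpm):
    """Apply func to each item using at most max_concurrency(rpm) worker threads."""
    with ThreadPoolExecutor(max_workers=max_concurrency(rpm)) as pool:
        return list(pool.map(func, items))  # results come back in input order
```

For Pro at 500 RPM this yields 8 workers, matching the example above. A bounded pool limits bursts but does not by itself guarantee the RPM budget, since fast responses let each worker issue more than one request per second.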

Idempotency

Requests are idempotent in the semantic sense — the same audio returns the same analysis — but each call is billed. If you retry after a network timeout, you may be charged twice. Deduplicate on the client if this matters.
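Client-side deduplication can be as simple as keying results by a hash of the audio bytes. The sketch below is an assumption-laden illustration: `AnalysisCache` is a hypothetical class, and `analyze` stands in for whatever function your code uses to call the API.

```python
import hashlib


class AnalysisCache:
    """Hypothetical helper: remembers results by SHA-256 of the audio bytes."""

    def __init__(self, analyze):
        self._analyze = analyze  # assumed callable: bytes -> analysis result
        self._results = {}       # hex digest -> cached result

    def get(self, audio_bytes):
        """Return the cached analysis, calling the API only for unseen audio."""
        key = hashlib.sha256(audio_bytes).hexdigest()
        if key not in self._results:
            self._results[key] = self._analyze(audio_bytes)
        return self._results[key]
```

This avoids paying again for audio that has already returned a result. It cannot prevent a double charge when the first attempt timed out after the server had processed it, because no result was ever received to cache.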
