Rate Limits & Idempotency

Per-key rate limit

Every API key is capped at 100 requests per second across all /v1/* endpoints. The cap is implemented as a sliding-window KV counter. Because that counter is eventually consistent, short bursts of up to ~110 rps may pass before the limit takes effect.

When you exceed the cap, the response is:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{ "error": "rate limited", "code": "RATE_LIMITED" }

Wait the number of seconds given in Retry-After before retrying. We do NOT auto-queue rejected requests; the caller is responsible for retrying.
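A caller-side retry loop for 429 responses might look like the sketch below. It is not part of any SDK: `withRateLimitRetry`, `retryAfterMs`, `HttpResponseLike`, and the `maxAttempts` parameter are hypothetical names, and the 1-second fallback for a missing or malformed header is an assumption.

```typescript
// Minimal shape of the response fields this sketch relies on.
// A standard fetch Response satisfies it.
interface HttpResponseLike {
  status: number
  headers: { get(name: string): string | null }
}

// Parse Retry-After as seconds; fall back to 1 s if absent or malformed (assumption).
function retryAfterMs(res: HttpResponseLike): number {
  const seconds = Number(res.headers.get("Retry-After"))
  return Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Call `doRequest` until it returns a non-429 response or attempts run out.
// If every attempt is rate limited, the last 429 response is returned.
async function withRateLimitRetry<T extends HttpResponseLike>(
  doRequest: () => Promise<T>,
  maxAttempts = 5, // hypothetical knob, not an API parameter
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  let res = await doRequest()
  for (let attempt = 1; attempt < maxAttempts && res.status === 429; attempt++) {
    await wait(retryAfterMs(res))
    res = await doRequest()
  }
  return res
}
```

The `wait` parameter exists only so the delay can be replaced in tests; in normal use the default `sleep` applies.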

New-contact monthly limit

Plans cap the number of new contacts you can message per day and per month, not message volume. Once a number is in your contact book (because you sent it a message in any prior period), further sends to that number are unlimited regardless of plan.

Plan	New contacts / day	New contacts / month
Starter ($49)	10	250
Growth ($249)	50	1,250
Scale ($749)	500	10,000

When the cap is hit, sends to new numbers fail with 403 QUOTA_EXCEEDED. Existing-contact sends keep flowing.
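Since a 429 is transient while a quota failure is not helped by an immediate retry, callers may want to branch on the two cases. The sketch below assumes the 403 body carries the same { error, code } shape shown for the 429 response, which is not confirmed here; `classifySend` and `SendOutcome` are hypothetical names.

```typescript
type SendOutcome = "ok" | "retry_later" | "quota_exceeded" | "other_error"

// Classify a send response by HTTP status and error code.
// Assumption: the 403 body has a `code` field like the 429 body does.
function classifySend(status: number, body: { code?: string } | null): SendOutcome {
  if (status >= 200 && status < 300) return "ok"
  if (status === 429) return "retry_later" // back off, then retry
  if (status === 403 && body?.code === "QUOTA_EXCEEDED") return "quota_exceeded" // new-contact cap hit
  return "other_error"
}
```

A caller could, for example, keep sending to existing contacts on "quota_exceeded" and hold back only the sends to new numbers.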

Idempotency

Server-side idempotency is not yet available. Until then, deduplicate caller-side. A common pattern:

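// Maps a dedup key to the in-flight or completed send, so concurrent
// duplicates share one request. `client` is your initialized API client.
// The map is process-local and unbounded; evict old entries in long-running services.
const seen = new Map<string, Promise<string>>()

function sendOnce(payload: { to: string; body: string }): Promise<string> {
  // JSON-encode the pair so a separator inside `to` or `body` cannot cause a key collision.
  const key = JSON.stringify([payload.to, payload.body])
  let pending = seen.get(key)
  if (!pending) {
    pending = client.messages.send(payload).then((result) => result.message_id)
    // Forget failed sends so the caller can retry them.
    pending.catch(() => seen.delete(key))
    seen.set(key, pending)
  }
  return pending
}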

Burst handling

The hard cap is enforced per second. If you sustain 110 rps or more for over a minute, expect intermittent 429s. The Retry-After value reflects when our window will reset, typically 1 second.
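One way to avoid 429s altogether is to throttle on the caller side. The token bucket below is a minimal sketch of that idea, not something the API provides: `TokenBucket` and `tryAcquire` are hypothetical names, and the 90 rps setting is an arbitrary margin under the 100 rps cap.

```typescript
// Token bucket: refills at `ratePerSec` tokens per second, holds at most `capacity`.
class TokenBucket {
  private tokens: number
  private last: number
  private readonly ratePerSec: number
  private readonly capacity: number
  private readonly now: () => number

  constructor(ratePerSec: number, capacity: number, now: () => number = () => Date.now()) {
    this.ratePerSec = ratePerSec
    this.capacity = capacity
    this.now = now // injectable clock in milliseconds, useful for tests
    this.tokens = capacity
    this.last = now()
  }

  // Returns 0 if a request may go now (and consumes one token),
  // otherwise the number of milliseconds to wait before trying again.
  tryAcquire(): number {
    const t = this.now()
    const refill = ((t - this.last) / 1000) * this.ratePerSec
    this.tokens = Math.min(this.capacity, this.tokens + refill)
    this.last = t
    if (this.tokens >= 1) {
      this.tokens -= 1
      return 0
    }
    return Math.ceil(((1 - this.tokens) / this.ratePerSec) * 1000)
  }
}

// 90 rps leaves headroom under the 100 rps cap; the margin is an arbitrary choice.
const bucket = new TokenBucket(90, 90)
```

Before each request, call `bucket.tryAcquire()`; a non-zero result is how long to sleep before calling it again. A single in-process bucket only helps if all traffic for the key goes through that process.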

Webhook delivery rate

Webhook deliveries to your endpoint are best-effort, with no rate cap on our side. We deliver events as they fire. We retry up to 3 times, with exponential backoff of 2, 4, and 8 seconds.
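To gauge how long an outage your endpoint can ride out, the retry schedule can be worked through as below. This sketch assumes each delay is measured from the previous failed attempt and ignores request time; `webhookRetryDelaySeconds` is a hypothetical helper.

```typescript
// Delay in seconds before retry n (1-based), doubling each time: 2, 4, 8.
function webhookRetryDelaySeconds(retry: number): number {
  return 2 ** retry
}

// Delays for the three retries, and their sum.
const delays = [1, 2, 3].map(webhookRetryDelaySeconds) // [2, 4, 8]
const totalWindowSeconds = delays.reduce((sum, d) => sum + d, 0) // 14
```

Under that assumption, the final retry lands roughly 14 seconds after the first attempt, so an endpoint that is unavailable for longer than that may miss the event.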

Voice rate limits

Outbound voice calls have a separate cap: 10 calls/minute/tenant. See Voice quickstart.