Rate limits

Limits are a flat per-minute quota, not tiered by plan. They protect the API from runaway loops, with plenty of headroom for normal use.

REST API: 100 requests / minute, per API key
MCP server: 240 requests / minute, per credential
Public endpoints: 5 requests / minute, per IP

The window is a 60-second sliding window. Authenticated limits are keyed by API key, so two keys behind one IP each get a full quota, and one key used across many IPs still shares a single quota.
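To make the keying behavior concrete, here is a minimal Python sketch of a sliding-window counter keyed by an arbitrary string. It only illustrates the behavior described above; the service's actual implementation is not documented here, and the class name `SlidingWindowLimiter` and its methods are hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Illustrative sliding-window log (hypothetical, not the service's code)."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        # One timestamp queue per key (an API key, a credential, or an IP).
        self._hits = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Assumed tiny limit of 2 so the effect is easy to see.
limiter = SlidingWindowLimiter(limit=2)
limiter.allow("key-a", now=0.0)   # allowed
limiter.allow("key-a", now=1.0)   # allowed
limiter.allow("key-a", now=2.0)   # rejected: key-a has used its quota
limiter.allow("key-b", now=2.0)   # allowed: key-b has its own quota
```

Because the counter is looked up by key alone, the caller's IP never enters into it, which is why one key used from many IPs shares a single quota.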

When you hit a limit

When a request exceeds the quota, the API returns 429 with the error code rate_limited. The Retry-After response header gives the number of seconds until the window resets. Wait that long, then retry.

429 Too Many Requests
Retry-After: 23

{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Too many requests. Retry after 23 seconds."
  }
}
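
The "sleep, then retry" advice can be sketched in Python as below. This is one possible client-side approach, not an official client: `call_with_retry`, `retry_after_seconds`, the `(status, headers, body)` response shape, and the one-second fallback wait are all assumptions made for illustration. The `send` argument is whatever function performs your HTTP request.

```python
import time
from typing import Callable, Mapping, Tuple

# Assumed response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; fall back to `default` if absent or unparsable."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default
    return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out each 429 for as long as Retry-After says."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        # Return anything that is not a 429, or the last 429 once attempts run out.
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        sleep(retry_after_seconds(headers))
    raise AssertionError("unreachable")
```

Capping the number of attempts keeps a misbehaving loop from retrying forever, and passing `sleep` as a parameter makes the function easy to test without real delays.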