Rate Limits

The Envizion AI API uses per-minute rate limiting based on your plan tier. Rate limits are applied per API key and enforced via a Redis-backed sliding window.
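To make the term "sliding window" concrete, here is a minimal in-memory sketch of one common variant, the sliding-window log. It is only an illustration of the general technique: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and nothing here is taken from the actual Redis-backed implementation.

```python
from collections import deque


class SlidingWindowLimiter:
    """In-memory sliding-window log (illustrative; not the server's code)."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop accepted requests that have aged out of the window ending at `now`.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False  # window is full: this request would be rejected
        self.timestamps.append(now)
        return True
```

Unlike a fixed window, the count here never resets all at once; capacity is freed one request at a time as old requests age out.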

Rate Limit Tiers

Your rate limit tier is determined by your account plan. Each API key inherits the tier of the account that created it.

Tier	Rate limit
Free	100 req/min
Starter	500 req/min
Pro	1,000 req/min
Admin	Unlimited

Rate Limit Headers

Every API response includes headers showing your current rate limit status:

Header	Description	Example
`X-RateLimit-Limit`	Maximum requests allowed per minute	500
`X-RateLimit-Remaining`	Requests remaining in the current window	487
`X-RateLimit-Reset`	Unix timestamp when the rate limit window resets	1739373225

Example response headers

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 487
X-RateLimit-Reset: 1739373225
X-Request-ID: req_abc123
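A client can read these three headers into a small structure after each response. The sketch below assumes all three values are plain integers, as in the example above; how an Unlimited (Admin) key reports its limit is not specified here, so that case is not handled. The names `RateLimitStatus` and `parse_rate_limit` are illustrative and not part of any SDK.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimitStatus:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp in seconds

    def seconds_until_reset(self, now: float) -> float:
        # Never negative, even if the reset time has already passed.
        return max(0.0, self.reset_at - now)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
    )
```

With an HTTP client such as httpx, the response's `headers` mapping can be passed in directly.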

Handling 429 Responses

When you exceed the rate limit, the API returns a 429 Too Many Requests response with a Retry-After header indicating how many seconds to wait before retrying.

429 Response

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 12
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1739373237

{
  "error": "rate_limit_exceeded",
  "detail": "Rate limit exceeded. Retry after 12 seconds.",
  "status": 429,
  "request_id": "req_abc123"
}

Python

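import time
import httpx

def api_request(url, headers, max_retries=5):
    """GET `url`, retrying on 429 at most `max_retries` times."""
    for attempt in range(max_retries + 1):
        resp = httpx.get(url, headers=headers)

        if resp.status_code != 429:
            return resp.json()

        if attempt == max_retries:
            break  # out of retries; do not sleep again

        # Wait as long as the server asks; fall back to 5s if the header is missing
        retry_after = int(resp.headers.get("Retry-After", 5))
        print(f"Rate limited, waiting {retry_after}s")
        time.sleep(retry_after)

    raise RuntimeError(f"Still rate limited after {max_retries} retries")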

# Or use the SDK (handles this automatically):
# client = EnvizionClient(api_key="vk_...")

TypeScript

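async function apiRequest(
  url: string,
  headers: Record<string, string>,
  maxRetries = 5
): Promise<unknown> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const resp = await fetch(url, { headers });

    if (resp.status !== 429) {
      return resp.json();
    }
    if (attempt === maxRetries) break; // out of retries; do not sleep again

    // Wait as long as the server asks; fall back to 5s if the header is missing
    const retryAfter = parseInt(
      resp.headers.get("Retry-After") ?? "5",
      10
    );
    console.log(`Rate limited, waiting ${retryAfter}s`);
    await new Promise(r =>
      setTimeout(r, retryAfter * 1000)
    );
  }

  throw new Error(`Still rate limited after ${maxRetries} retries`);
}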

// Or use the SDK (handles this automatically):
// const client = new EnvizionClient("vk_...");

Best Practices

Use the official SDKs

Both the Python and TypeScript SDKs include automatic retry with exponential backoff on 429 responses.

Monitor remaining requests

Check the X-RateLimit-Remaining header to proactively throttle before hitting the limit.
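One possible way to act on this header is to pause once the remaining budget drops below a small reserve. The function below is a sketch under that assumption; `throttle_delay` and the `reserve` threshold of 5 are hypothetical choices, not values recommended by the API.

```python
def throttle_delay(remaining: int, reset_at: float, now: float, reserve: int = 5) -> float:
    """Seconds to pause before the next request; 0.0 means send immediately."""
    seconds_left = max(0.0, reset_at - now)
    if remaining > reserve:
        return 0.0  # plenty of budget left
    if remaining <= 0:
        return seconds_left  # budget exhausted: wait for the reset
    # Spread the last few requests over the rest of the window.
    return seconds_left / remaining
```

Here `remaining` and `reset_at` come from `X-RateLimit-Remaining` and `X-RateLimit-Reset`, and `now` is the current Unix time (for example `time.time()`).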

Implement exponential backoff

If building a custom client, use exponential backoff with jitter instead of fixed retry intervals.
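As a rough sketch, the delay for each retry can be computed with the "full jitter" variant of exponential backoff. The function name and the `base` and `cap` defaults are illustrative assumptions; the official SDKs may use different values.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0, rng=random.random) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))
    # `rng` returns a float in [0, 1); it is injectable so tests can be deterministic.
    return rng() * ceiling
```

The randomness keeps many clients that were throttled at the same moment from retrying in lockstep. When a 429 response carries a Retry-After header, waiting at least that long (for example `max(retry_after, backoff_delay(attempt))`) seems the safer choice.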

Cache responses when possible

Cache agent listings, tool catalogs, and other read endpoints that do not change frequently.
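A small time-to-live cache is often enough for this. The sketch below is one minimal way to do it; `TTLCache` is a hypothetical helper, and the right TTL depends on how stale a listing your application can tolerate.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Caches fetched values for `ttl_seconds` so repeat reads cost no requests."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: no API request spent
        value = fetch()  # missing or expired: call the API once and store the result
        self._entries[key] = (now, value)
        return value
```

In use, `fetch` would be a function that performs the actual API call, such as a request to an agent-listing endpoint.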

Use webhooks instead of polling

Instead of polling /v1/runs/:id every second, which spends 60 requests per minute on a single run, use SSE streaming or webhook notifications.

Spread requests over time

If you need to make many requests, distribute them evenly across the minute window rather than sending bursts.
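The spacing follows directly from the tier limit: at N requests per minute, one request every 60/N seconds stays within budget. The helper below sketches that arithmetic; `request_schedule` is a hypothetical name, and a real client would sleep until each computed time before sending.

```python
from typing import List


def request_schedule(count: int, limit_per_minute: int, start: float = 0.0) -> List[float]:
    """Send times (in seconds) that space `count` requests evenly at the tier's rate."""
    interval = 60.0 / limit_per_minute  # e.g. 0.12s between requests on Starter (500/min)
    return [start + i * interval for i in range(count)]
```

Pacing slightly below the limit leaves headroom for other processes that share the same API key.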

Upgrading Your Tier

Your rate limit tier is tied to your plan. To increase your limit:

1. Go to your dashboard.
2. Navigate to Settings > Billing.
3. Upgrade to Starter (500/min) or Pro (1,000/min).

All existing API keys will automatically use the new tier.
