Rate Limits
The Envizion AI API uses per-minute rate limiting based on your plan tier. Rate limits are applied per API key and enforced via a Redis-backed sliding window.
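To make the sliding-window behavior concrete, here is a minimal in-memory sketch. It is not Envizion's server code: the class name `SlidingWindowLimiter` and its parameters are illustrative assumptions, and the real limiter is only described here as Redis-backed.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window` seconds.

    Hypothetical illustration only; the production limiter stores its
    state in Redis and may differ in detail.
    """

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._stamps = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have slid out of the trailing window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

Because the window trails the current time, capacity comes back gradually as old requests age out rather than all at once on a minute boundary.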
Rate Limit Tiers
Your rate limit tier is determined by your account plan. Each API key inherits the tier of the account that created it.
| Tier | Rate limit |
|---|---|
| Free | 100 req/min |
| Starter | 500 req/min |
| Pro | 1,000 req/min |
| Admin | Unlimited |
Rate Limit Headers
Every API response includes headers showing your current rate limit status:
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed per minute | 500 |
| X-RateLimit-Remaining | Requests remaining in the current window | 487 |
| X-RateLimit-Reset | Unix timestamp when the rate limit window resets | 1739373225 |
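A client can read the three headers in the table into a small structure. The sketch below assumes the values are always integers, which may not hold for the Admin tier since its limit is listed as unlimited; `RateLimitStatus` and `parse_rate_limit` are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimitStatus:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset (Unix timestamp)

    def seconds_until_reset(self, now: float) -> float:
        """Seconds from `now` until the window resets, never negative."""
        return max(0.0, self.reset_at - now)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    """Build a RateLimitStatus from response headers (assumes integer values)."""
    return RateLimitStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
    )
```

A plain `dict` is case-sensitive, so the keys must match the casing shown; header objects from HTTP clients such as httpx are case-insensitive and can be passed in directly.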

    HTTP/1.1 200 OK
    Content-Type: application/json
    X-RateLimit-Limit: 500
    X-RateLimit-Remaining: 487
    X-RateLimit-Reset: 1739373225
    X-Request-ID: req_abc123

Handling 429 Responses
When you exceed the rate limit, the API returns a 429 Too Many Requests response with a Retry-After header indicating how many seconds to wait before retrying.

    HTTP/1.1 429 Too Many Requests
    Content-Type: application/json
    Retry-After: 12
    X-RateLimit-Limit: 100
    X-RateLimit-Remaining: 0
    X-RateLimit-Reset: 1739373237

    {
      "error": "rate_limit_exceeded",
      "detail": "Rate limit exceeded. Retry after 12 seconds.",
      "status": 429,
      "request_id": "req_abc123"
    }

Python
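
    import time

    import httpx


    def api_request(url, headers, max_retries=5):
        """GET `url`, waiting out 429 responses for up to `max_retries` retries."""
        for attempt in range(max_retries + 1):
            resp = httpx.get(url, headers=headers)
            if resp.status_code != 429:
                return resp.json()
            if attempt == max_retries:
                break
            # Honor the server's Retry-After hint; fall back to 5 seconds.
            retry_after = int(resp.headers.get("Retry-After", 5))
            print(f"Rate limited, waiting {retry_after}s")
            time.sleep(retry_after)
        raise RuntimeError(f"Still rate limited after {max_retries} retries")


    # Or use the SDK (handles this automatically):
    # client = EnvizionClient(api_key="vk_...")

TypeScript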

    async function apiRequest(
      url: string,
      headers: Record<string, string>,
      maxRetries = 5
    ): Promise<unknown> {
      const resp = await fetch(url, { headers });
      if (resp.status === 429) {
        if (maxRetries <= 0) {
          throw new Error("Still rate limited after retries");
        }
        // Honor the server's Retry-After hint; fall back to 5 seconds.
        const retryAfter = parseInt(
          resp.headers.get("Retry-After") ?? "5",
          10
        );
        console.log(`Rate limited, waiting ${retryAfter}s`);
        await new Promise((r) => setTimeout(r, retryAfter * 1000));
        return apiRequest(url, headers, maxRetries - 1);
      }
      return resp.json();
    }

    // Or use the SDK (handles this automatically):
    // const client = new EnvizionClient("vk_...");

Best Practices
Use the official SDKs
Both the Python and TypeScript SDKs include automatic retry with exponential backoff on 429 responses.
Monitor remaining requests
Check the X-RateLimit-Remaining header to proactively throttle before hitting the limit.
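One way to act on this header is to pause once the remaining budget drops below a safety margin. The sketch below is an assumption about client-side policy, not documented API behavior; `throttle_delay` and its `floor` threshold are hypothetical.

```python
def throttle_delay(remaining: int, reset_at: float, now: float, floor: int = 5) -> float:
    """Return how many seconds to pause before the next request.

    `remaining` and `reset_at` come from X-RateLimit-Remaining and
    X-RateLimit-Reset. `floor` is an illustrative safety margin: while
    more than `floor` requests remain, no pause is needed.
    """
    if remaining > floor:
        return 0.0
    # Budget nearly spent: wait until the window resets (never negative).
    return float(max(0.0, reset_at - now))
```

With the example headers above (487 remaining), the function returns 0.0; with 3 remaining and a reset 25 seconds away, it returns 25.0.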
Implement exponential backoff
If building a custom client, use exponential backoff with jitter instead of fixed retry intervals.
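A minimal sketch of one common variant, "full jitter", where each wait is drawn uniformly between zero and an exponentially growing ceiling. The defaults (`base`, `cap`) and the choice to never wait less than Retry-After are assumptions for illustration; the SDKs' actual retry policy is not documented here.

```python
import random


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    retry_after: float = 0.0,
    rng=random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the delay is a
    random fraction of that ceiling, but never less than `retry_after`.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return max(retry_after, rng() * ceiling)
```

The jitter keeps many clients that were rate limited at the same moment from all retrying at the same moment. Passing the Retry-After value as `retry_after` keeps the randomized delay from undercutting the server's hint.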
Cache responses when possible
Cache agent listings, tool catalogs, and other read endpoints that do not change frequently.
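A small time-to-live cache is often enough for this. The sketch below is a generic illustration: `TTLCache` is a hypothetical helper, and the right TTL depends on how often your listings actually change.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache values per key and refetch once an entry is older than `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: no API request made
        value = fetch()  # stale or missing: spend one request
        self._entries[key] = (now, value)
        return value
```

In use, `fetch` would be a zero-argument callable that performs the actual API request, so repeated lookups within the TTL cost no rate limit budget.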
Use webhooks instead of polling
Instead of polling /v1/runs/:id every second, use SSE streaming or webhook notifications.
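For the SSE option, a client reads one long-lived response instead of issuing a request per second. The sketch below parses the standard `text/event-stream` line format only; the streaming endpoint, event names, and payload shape are not specified here, so the ones in the usage note are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from the lines of an SSE stream.

    Handles `event:` and `data:` fields, comment lines, and blank-line
    dispatch; other fields (`id:`, `retry:`) are ignored in this sketch.
    """
    event = "message"  # default event name per the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Feeding it the lines `event: run.completed`, `data: {"id": "run_1"}`, and a blank line yields a single `("run.completed", '{"id": "run_1"}')` pair; the event name `run.completed` is made up for this example.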
Spread requests over time
If you need to make many requests, distribute them evenly across the minute window rather than sending bursts.
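Even spacing amounts to sending one request every 60 / limit seconds. A minimal sketch, with `request_schedule` as a hypothetical helper that returns send-time offsets rather than sleeping itself:

```python
from typing import List


def request_schedule(count: int, limit_per_minute: int, start: float = 0.0) -> List[float]:
    """Return `count` send times, evenly spaced to stay within the per-minute limit."""
    interval = 60.0 / limit_per_minute  # seconds between consecutive requests
    return [start + i * interval for i in range(count)]
```

On the Free tier (100 req/min) the interval works out to 0.6 seconds. A caller would sleep until each offset before sending, and could pace slightly below the limit to leave headroom for other clients sharing the same API key.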
Upgrading Your Tier
Your rate limit tier is tied to your plan. To increase your limit:
- Go to your dashboard
- Navigate to Settings > Billing
- Upgrade to Starter (500/min) or Pro (1,000/min)
- All existing API keys will automatically use the new tier