Rate Limits
AICredits rate limit tiers — RPM limits, concurrency caps, budget limits, response headers, 429 handling, and retry strategies.
AICredits enforces rate limits and concurrency limits to ensure fair usage and system stability.
RPM Limits (Requests Per Minute)
Each API key has a maximum number of requests per minute. The default is 60 RPM, but this is configurable per key (up to 10,000 RPM).
Rate limits are enforced per API key using a rolling 1-minute window backed by Redis. When you exceed the limit, requests return a 429 Too Many Requests response.
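To avoid hitting 429s in the first place, a client can mirror the rolling window locally. The following is a minimal sketch, not AICredits code: `SlidingWindowLimiter` and its parameters are hypothetical names, and the server's Redis-backed accounting may differ from this in-process approximation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle that approximates a rolling-window RPM limit."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.sent = deque()  # timestamps of requests still inside the window

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, else the time to wait."""
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The oldest request must age out before another one fits.
        return self.window_seconds - (now - self.sent[0])

    def acquire(self):
        """Block until a slot is free, then record the request."""
        while True:
            wait = self.seconds_until_allowed()
            if wait <= 0:
                break
            time.sleep(wait)
        self.sent.append(self.clock())
```

Calling `limiter.acquire()` before each API request keeps a single process under the configured RPM. Several processes sharing one key would each need a smaller share of the limit, since this sketch does not coordinate between them.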
Concurrency Limits
Concurrency limits cap the number of simultaneous active requests per user. This prevents a single user from monopolizing system resources.
| Aspect | Details |
|---|---|
| Default Limit | 5 concurrent requests |
| Error Code | 429 with concurrency-specific message |
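One way to stay under the concurrency cap is to gate outgoing calls with a semaphore. This is a sketch under assumptions: `MAX_IN_FLIGHT`, `with_concurrency_cap`, and `run_many` are hypothetical names, and `send_request` stands in for whatever function actually calls the API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 5  # the documented default; adjust if your limit differs
_slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)


def with_concurrency_cap(send_request, *args, **kwargs):
    """Run send_request while holding one of the MAX_IN_FLIGHT slots."""
    with _slots:
        return send_request(*args, **kwargs)


def run_many(send_request, items, workers=20):
    """Fan out over a thread pool; the semaphore, not the pool size, enforces the cap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda item: with_concurrency_cap(send_request, item), items)
        )
```

Because the limit applies per user rather than per key, a semaphore inside one process only helps if that process is the user's only source of traffic.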
Response Headers
Every API response includes rate limit headers so you can monitor your usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests per minute for this key |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Seconds until the rate limit window resets |
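The three headers in the table can be turned into a small status object that a client checks before sending more traffic. A minimal sketch: `RateLimitStatus`, `parse_rate_limit_headers`, `suggested_pause`, and the `low_water_mark` threshold are hypothetical and not part of any AICredits SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int
    remaining: int
    reset_seconds: int


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Extract the X-RateLimit-* values, or return None if they are unusable."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_seconds=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None  # headers missing or not integers


def suggested_pause(status: RateLimitStatus, low_water_mark: int = 2) -> float:
    """Return seconds to pause when the window is nearly exhausted, else 0."""
    if status.remaining <= low_water_mark:
        return float(status.reset_seconds)
    return 0.0
```

With the OpenAI Python SDK, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` and `.parse()`. Check this against the SDK version you use.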
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 35
Content-Type: application/json
```
Budget Limits
Each API key can have an optional budget limit in INR. When the key's cumulative usage exceeds the budget, further requests are rejected with a 402 status.
Use budget limits to prevent unexpected costs. Set them when creating API keys, especially for development or testing keys.
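Since the budget is cumulative, a 402 does not clear by waiting, unlike a 429. A client can encode that distinction so it never retries a budget rejection. The sketch below is illustrative only: `Action` and `action_for_status` are hypothetical names, and treating every other error as a stop is a simplification made for this example.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"
    RETRY = "retry"
    STOP = "stop"


def action_for_status(status_code: int) -> Action:
    """Decide what to do with a response based on its HTTP status."""
    if 200 <= status_code < 300:
        return Action.OK
    if status_code == 429:
        return Action.RETRY  # rate or concurrency limit: transient, back off and retry
    if status_code == 402:
        return Action.STOP  # budget exhausted: retrying cannot succeed
    return Action.STOP  # other errors are out of scope for this sketch
```

On `Action.STOP` for a 402, the useful response is to alert whoever owns the key so the budget can be raised or the key rotated.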
Handling 429 Errors
When rate limited, implement exponential backoff with jitter:
```python
import time
import random

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.aicredits.in/v1",
    api_key="sk-your-key-here",
)


def chat_with_retry(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="openai/gpt-4o-mini",
                messages=messages,
            )
        except RateLimitError:
            # Out of attempts: surface the error to the caller.
            if attempt == max_retries - 1:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of jitter.
            wait = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Retrying in {wait:.1f}s...")
            time.sleep(wait)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.aicredits.in/v1",
  apiKey: "sk-your-key-here",
});

async function chatWithRetry(
  messages: OpenAI.ChatCompletionMessageParam[],
  maxRetries = 5
) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.chat.completions.create({
        model: "openai/gpt-4o-mini",
        messages,
      });
    } catch (error: any) {
      // Rethrow anything that is not a 429, or a 429 on the final attempt.
      if (error?.status !== 429 || attempt === maxRetries - 1) throw error;
      // Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of jitter.
      const wait = 2 ** attempt + Math.random();
      console.log(`Rate limited. Retrying in ${wait.toFixed(1)}s...`);
      await new Promise((r) => setTimeout(r, wait * 1000));
    }
  }
}
```
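The backoff delay can also take the `X-RateLimit-Reset` header into account, so a retry does not fire before the window has reset. This is a sketch under assumptions: `retry_delay`, `base`, and `cap` are hypothetical, and it assumes the reset value from a 429 response is available to the caller. For a concurrency-related 429 the reset value may not be meaningful, in which case pass `None`.

```python
import random
from typing import Callable, Optional


def retry_delay(
    attempt: int,
    reset_seconds: Optional[float] = None,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Exponential backoff, capped so late attempts do not wait unboundedly.
    backoff = min(cap, base * (2 ** attempt))
    if reset_seconds is not None:
        # Waiting less than the reported reset would likely hit 429 again.
        backoff = max(backoff, min(cap, reset_seconds))
    # Jitter (up to 1s by default) spreads out clients that were limited together.
    return backoff + jitter()
```

Replacing the fixed `2 ** attempt` computation in a retry loop with `retry_delay(attempt, reset_seconds)` keeps the same backoff shape while bounding the longest wait at roughly `cap` seconds.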