Rate Limits

AICredits rate limits: RPM limits, concurrency caps, budget limits, response headers, 429 handling, and retry strategies.

AICredits enforces rate limits and concurrency limits to ensure fair usage and system stability.

RPM Limits (Requests Per Minute)

Each API key has a maximum number of requests per minute. The default is 60 RPM, but this is configurable per key (up to 10,000 RPM).

Rate limits are enforced per API key using a rolling 1-minute window backed by Redis. When you exceed the limit, requests return a 429 Too Many Requests response.
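To make the rolling-window behavior concrete, here is a minimal client-side sketch that mirrors it in memory so you can avoid sending requests that would be rejected. It is not the AICredits server implementation (which the text above says is backed by Redis); the class name `RollingWindowLimiter` and its parameters are hypothetical, and the default of 60 requests per 60 seconds is taken from the default RPM above.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window` seconds."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.timestamps = deque()  # send times of recent requests, oldest first

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

Call `try_acquire()` before each request and wait briefly when it returns `False`. Because the server keeps its own count, treat this only as a way to reduce 429 responses, not to eliminate them.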

Concurrency Limits

Concurrency limits cap the number of simultaneous active requests per user. This prevents a single user from monopolizing system resources.

Aspect         Details
Default Limit  5 concurrent requests
Error Code     429 with concurrency-specific message
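One way to stay under the concurrency cap from the client side is to gate every request behind a semaphore. The sketch below assumes a threaded client and the default limit of 5; `MAX_CONCURRENT` and `run_with_slot` are hypothetical names introduced for illustration.

```python
import threading

MAX_CONCURRENT = 5  # assumed to match the default concurrency limit above

# Each in-flight request holds one slot; extra callers block until one frees up.
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


def run_with_slot(fn, *args, **kwargs):
    """Run fn while holding one of the MAX_CONCURRENT slots."""
    with _slots:
        return fn(*args, **kwargs)
```

For example, `run_with_slot(client.chat.completions.create, model=..., messages=...)` would block while five other calls are already in flight instead of triggering a concurrency 429.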

Response Headers

Every API response includes rate limit headers so you can monitor your usage:

Header                 Description
X-RateLimit-Limit      Maximum requests per minute for this key
X-RateLimit-Remaining  Requests remaining in current window
X-RateLimit-Reset      Seconds until the rate limit window resets
Example response:
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 35
Content-Type: application/json
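These headers can drive proactive throttling. The following sketch reads the three headers from any mapping of header names to values and suggests how long to pause before the next request. The function names and the `min_remaining` threshold are hypothetical; only the header names come from the table above.

```python
def parse_rate_limit_headers(headers):
    """Extract the three X-RateLimit-* values as integers (None if absent)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        try:
            return int(lowered.get(name))
        except (TypeError, ValueError):
            return None

    return {
        "limit": as_int("x-ratelimit-limit"),
        "remaining": as_int("x-ratelimit-remaining"),
        "reset_seconds": as_int("x-ratelimit-reset"),
    }


def seconds_to_pause(headers, min_remaining=1):
    """Return how long to wait before the next request, or 0 to send now."""
    info = parse_rate_limit_headers(headers)
    if info["remaining"] is None or info["reset_seconds"] is None:
        return 0  # headers missing or malformed: do not throttle
    if info["remaining"] < min_remaining:
        return info["reset_seconds"]
    return 0
```

With the OpenAI Python SDK, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()` for the usual completion object.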

Budget Limits

Each API key can have an optional budget limit in INR. When the key's cumulative usage exceeds the budget, further requests are rejected with a 402 status.

Use budget limits to prevent unexpected costs. Set them when creating API keys, especially for development or testing keys.
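Unlike a 429, a 402 will not clear by waiting, so client code should treat the two differently. A minimal sketch of that decision follows; `classify_status` is a hypothetical helper, and its handling of statuses other than 2xx, 402, and 429 is an assumption, since this page does not describe them.

```python
def classify_status(status_code):
    """Map an HTTP status code to a client action: 'ok', 'retry', or 'stop'."""
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 429:
        # RPM or concurrency limit: transient, so back off and retry.
        return "retry"
    if status_code == 402:
        # Budget exhausted: retrying cannot succeed until the budget changes.
        return "stop"
    # Other statuses are not covered on this page; treated as non-retryable
    # here purely as an assumption of this sketch.
    return "stop"
```

With the OpenAI Python SDK, a 402 should surface as an `APIStatusError` whose `status_code` attribute can be passed to a helper like this one.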

Handling 429 Errors

When rate limited, implement exponential backoff with jitter:

import time
import random
from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="https://api.aicredits.in/v1",
    api_key="sk-your-key-here",
)

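def chat_with_retry(messages, max_retries=5):
    """Call the chat completions endpoint, retrying on 429 responses."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="openai/gpt-4o-mini",
                messages=messages,
            )
        except RateLimitError:
            # Out of attempts: re-raise so the caller sees the 429.
            if attempt == max_retries - 1:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter.
            wait = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Retrying in {wait:.1f}s...")
            time.sleep(wait)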

The same retry logic in TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.aicredits.in/v1",
  apiKey: "sk-your-key-here",
});

async function chatWithRetry(
  messages: OpenAI.ChatCompletionMessageParam[],
  maxRetries = 5
) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.chat.completions.create({
        model: "openai/gpt-4o-mini",
        messages,
      });
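    } catch (error: any) {
      // Rethrow anything that is not a 429, or a 429 on the final attempt.
      if (error?.status !== 429 || attempt === maxRetries - 1) throw error;
      // Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter.
      const wait = 2 ** attempt + Math.random();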
      console.log(`Rate limited. Retrying in ${wait.toFixed(1)}s...`);
      await new Promise((r) => setTimeout(r, wait * 1000));
    }
  }
}
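The retry loops above grow the wait without bound and ignore the X-RateLimit-Reset header. As one possible refinement, the sketch below computes the delay separately: it prefers the server-reported reset time when one is available and otherwise falls back to capped exponential backoff, adding jitter in both cases. The function name, the `base` and `cap` defaults, and the injectable `rng` are all assumptions made for illustration.

```python
import random


def backoff_delay(attempt, reset_seconds=None, base=1.0, cap=30.0, rng=random.random):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    jitter = rng()  # in [0, 1): spreads out clients that were limited together
    if reset_seconds is not None:
        # The server reported when the window resets, so wait that long.
        return reset_seconds + jitter
    # No hint from the server: double the wait each attempt, up to `cap`.
    return min(cap, base * 2 ** attempt) + jitter
```

Inside a retry loop, `time.sleep(backoff_delay(attempt))` would replace the inline `wait` calculation; pass `reset_seconds` when the 429 response's headers are available.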
