Uptime & Reliability
AICredits SLA, uptime targets, incident history, and status page. Built on multi-key provider health tracking with automatic failover.
AICredits is designed for production workloads. This page covers our uptime commitments, health tracking, and what happens when an upstream provider has an outage.
Service Architecture
AICredits sits in front of multiple LLM providers. When you make a request, the proxy:
- Validates your API key (Redis cache, 5ms typical)
- Applies rate limiting and guardrails
- Routes to the primary provider using a healthy API key
- Falls back automatically if the primary is unavailable
This multi-layer architecture means a single provider outage rarely causes an AICredits outage.
Circuit Breaker
Each provider API key has its own health state. When a key returns repeated errors (5xx or 429s), it is marked unhealthy and skipped for 30 seconds. Requests are automatically routed to other healthy keys.
| State | Trigger | Duration |
|---|---|---|
| Healthy | Successful request | Default |
| Unhealthy | Repeated 5xx, 429, or timeout | 30 seconds |
| Recovered | First successful request | Immediate |
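The state table above can be illustrated with a minimal per-key breaker. This is only a sketch, not the AICredits implementation: the `KeyHealth` class, the `pick_key` helper, and the failure threshold of 3 are hypothetical (the page says "repeated" without giving a number). Only the 30-second cooldown comes from the table.

```python
import time


class KeyHealth:
    """Per-key circuit breaker: skip a key for a cooldown after repeated failures."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value; the docs only say "repeated"
        self.cooldown_seconds = cooldown_seconds    # 30 seconds, per the state table
        self._clock = clock                         # injectable so the logic can be tested
        self._failures = 0
        self._unhealthy_until = 0.0

    def is_available(self):
        # A key is usable again as soon as its cooldown has elapsed.
        return self._clock() >= self._unhealthy_until

    def record_success(self):
        # The first successful request restores the healthy state immediately.
        self._failures = 0
        self._unhealthy_until = 0.0

    def record_failure(self):
        # Count 5xx / 429 / timeout results; trip the breaker at the threshold.
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._unhealthy_until = self._clock() + self.cooldown_seconds
            self._failures = 0


def pick_key(keys):
    """Return the name of the first available key, or None if all are cooling down."""
    for name, health in keys.items():
        if health.is_available():
            return name
    return None
```

With two keys registered, three consecutive failures on the first cause `pick_key` to return the second until 30 seconds have passed.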
SLA Targets
| Metric | Target |
|---|---|
| API availability | 99.9% monthly uptime |
| P50 latency (chat completions) | < 500ms to first token |
| P99 latency | < 5s to first token |
| Error budget | < 0.1% of requests |
Uptime is measured on the AICredits proxy layer. Provider-side latency and errors (counted in your usage logs) are separate from AICredits infrastructure availability.
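To put the targets in concrete terms, the arithmetic below converts an availability percentage into allowed downtime and an error budget into a request count. The function names are illustrative, and the 30-day month is an assumption, since this page does not say how a "month" is defined for SLA purposes.

```python
def allowed_downtime_minutes(availability_pct, days_in_month=30):
    """Minutes of downtime permitted per month at the given availability target."""
    total_minutes = days_in_month * 24 * 60  # 43,200 minutes in an assumed 30-day month
    return total_minutes * (1 - availability_pct / 100)


def error_budget_requests(total_requests, error_budget_pct=0.1):
    """Upper bound on failed requests implied by the error budget percentage."""
    return round(total_requests * error_budget_pct / 100)
```

Under these assumptions, 99.9% availability allows roughly 43.2 minutes of downtime in a 30-day month, and a 0.1% error budget on one million requests corresponds to about 1,000 failed requests.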
Health Check Endpoints
Use these endpoints for your own uptime monitoring:
| Endpoint | Description |
|---|---|
| GET /health | Returns 200 if the server is running |
| GET /health/ready | Returns 200 if database + Redis connections are healthy |
| GET /health/live | Kubernetes liveness probe endpoint |
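As an alternative to curl, the same endpoints can be polled from Python using only the standard library. This is a sketch under a few assumptions: the `fetch` and `probe` helpers are hypothetical names, the 5-second timeout is arbitrary, and the response bodies are assumed to be JSON as in the curl examples on this page.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://api.aicredits.in"  # host taken from the curl examples on this page


def fetch(url, timeout=5.0):
    """Return (status_code, body_text); status_code is None on a network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # Non-2xx responses still carry a status code and a body.
        return exc.code, exc.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers URLError, DNS failures, refused connections, and timeouts.
        return None, ""


def probe(path, fetcher=fetch):
    """Summarise one health endpoint as a dict suitable for logging or alerting."""
    status_code, body = fetcher(BASE_URL + path)
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        payload = {}
    return {
        "path": path,
        "healthy": status_code == 200,
        "status_code": status_code,
        "payload": payload,
    }
```

Calling `probe("/health/ready")` from a scheduler or cron job returns a dict whose `healthy` flag can drive an alert. The `fetcher` parameter exists so the logic can be exercised without network access.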
curl https://api.aicredits.in/health
# {"status": "ok"}
curl https://api.aicredits.in/health/ready
# {"status": "ready", "db": "ok", "redis": "ok"}
Incident Response
When an upstream provider has a major incident:
- The circuit breaker automatically marks affected keys as unhealthy
- Traffic shifts to healthy keys or other providers
- If all routes for a model are unavailable, requests return 502 Bad Gateway
- The proxy retries with exponential backoff (500ms → 1s → 2s, max 3 attempts)
For extended provider outages, check the status page of the affected provider directly.
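The proxy-side backoff schedule (500ms → 1s → 2s) follows a simple doubling rule. The sketch below reproduces those numbers. The `backoff_delays` helper is hypothetical, and it reflects one reading of the schedule in which each of the three attempts has an associated delay. It is not the proxy's actual code.

```python
def backoff_delays(base_seconds=0.5, factor=2, max_attempts=3):
    """Delay attached to each attempt: base, base*factor, base*factor^2, ..."""
    return [base_seconds * factor ** attempt for attempt in range(max_attempts)]


def worst_case_wait(delays):
    """Total time spent waiting if every attempt's delay is used."""
    return sum(delays)
```

Under this reading, proxy retries add at most 3.5 seconds of waiting before a request fails. That figure is useful when choosing a client-side timeout.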
Retry Guidance
For production workloads, implement client-side retries for 429, 500, 502, and 504 responses:
import time
import random
from openai import OpenAI, APIStatusError
client = OpenAI(
    base_url="https://api.aicredits.in/v1",
    api_key="sk-your-key-here",
    max_retries=0,  # Handle retries manually for more control
)
RETRYABLE_STATUS = (429, 500, 502, 504)
def call_with_retry(messages, max_attempts=4):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(
                model="openai/gpt-4o-mini",
                messages=messages,
            )
        except APIStatusError as e:  # APIStatusError carries the HTTP status_code
            if e.status_code in RETRYABLE_STATUS and attempt < max_attempts - 1:
                # Exponential backoff (1s, 2s, 4s) plus up to 0.5s of jitter
                wait = (2 ** attempt) + random.uniform(0, 0.5)
                time.sleep(wait)
            else:
                raise
See the Error Handling guide for the full retry matrix.