
How to Use Multiple AI Models with One API Key (Python Tutorial)
A step-by-step Python tutorial for routing requests to GPT-4o, Claude, Gemini, and DeepSeek through a single API key — with cost tracking in ₹ for every call.
Author: AICredits Team
Published: 14 Mar 2026
Reading time: 7 min read
The problem with managing multiple LLM APIs
A typical AI application in 2026 uses at least two or three different models: a fast, cheap model for triage, a capable model for generation, and sometimes a specialised model for specific tasks. Managing separate API keys, rate limits, and billing for each provider is operational overhead that compounds as your stack grows.
AICredits solves this with a single OpenAI-compatible endpoint that routes to 300+ models across all major providers. One key, one billing wallet, one usage dashboard.
Installation and setup
pip install openai
# Set your credentials
export AICREDITS_BASE_URL="https://api.aicredits.in/v1"
export AICREDITS_API_KEY="sk-your-aicredits-key"
Basic usage: switching models by name
import os
from openai import OpenAI

# The client reads the gateway URL and key from the environment
client = OpenAI(
    base_url=os.environ["AICREDITS_BASE_URL"],
    api_key=os.environ["AICREDITS_API_KEY"],
)

# GPT-4o
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Explain async/await in Python."}],
)

# Claude 3.5 Sonnet: same SDK, same format
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Refactor this function for readability."}],
)

# Gemini Flash: cheapest option
response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Classify this as spam or not spam."}],
)
The gateway routes each request to the correct provider and returns an OpenAI-compatible response.
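Because every response comes back in the same OpenAI-compatible shape, one helper can read the output of any model. The following is a minimal sketch, not part of the AICredits SDK: `summarise_response` is a hypothetical name, and the `fake` object is a stand-in shaped like a chat completion so the example runs without a network call.

```python
from types import SimpleNamespace


def summarise_response(response) -> dict:
    """Collect the fields a chat completion response exposes."""
    usage = response.usage
    return {
        "model": response.model,  # model name reported in the response
        "text": response.choices[0].message.content,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
    }


# Stand-in object shaped like a chat completion (assumed values, for illustration only).
fake = SimpleNamespace(
    model="google/gemini-2.0-flash-001",
    choices=[SimpleNamespace(message=SimpleNamespace(content="not spam"))],
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=2),
)

print(summarise_response(fake))
```

With a real client, pass the `response` returned by `client.chat.completions.create(...)` instead of `fake`.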
Cost-aware routing pattern
# Route tasks to the best model for cost vs quality
ROUTING_TABLE = {
    "classify": "openai/gpt-4o-mini",  # ₹14/M input, fast and cheap
    "summarise": "anthropic/claude-3-5-haiku-20241022",  # ₹96/M, better quality
    "generate": "anthropic/claude-3-5-sonnet-20241022",  # ₹289/M, complex tasks
    "code": "anthropic/claude-3-5-sonnet-20241022",  # ₹289/M, best code model
}


def ask(task_type: str, prompt: str) -> str:
    # Unknown task types fall back to the cheapest model
    model = ROUTING_TABLE.get(task_type, "openai/gpt-4o-mini")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    usage = response.usage
    print(f"[{task_type}] model={model} tokens={usage.prompt_tokens}+{usage.completion_tokens}")
    return response.choices[0].message.content


# Each task uses the appropriate model
label = ask("classify", "Classify: 'My payment failed' → billing/technical/general")
summary = ask("summarise", "Summarise this 500-word article in 2 sentences: ...")
reply = ask("generate", "Write a professional reply to this customer complaint: ...")
Fallback handling
AICredits handles provider-level failover automatically. For application-level fallback (fall back to a cheaper model after a timeout), wrap your call in a try/except:
from openai import APIConnectionError, APIStatusError


def resilient_ask(prompt: str) -> str:
    # Try the stronger model first, then fall back to the cheaper one
    for model in ["anthropic/claude-3-5-sonnet-20241022", "openai/gpt-4o-mini"]:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                timeout=10,  # seconds
            )
            return response.choices[0].message.content
        except (APIStatusError, APIConnectionError):
            # The SDK raises APITimeoutError (a subclass of APIConnectionError)
            # on timeout, not the built-in TimeoutError
            continue
    raise RuntimeError("All models failed")
Log the model used, token counts, and response time for each request. This data feeds the cost analysis that helps you refine your routing strategy over time.
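One way to capture that data is to wrap each call in a small timing helper that also estimates the cost in ₹. This is a sketch under stated assumptions: `CallRecord`, `estimate_cost_inr`, and `timed_call` are hypothetical names, the rates are passed in as ₹ per million tokens, and the output rate used below is a placeholder, so take real figures from your own dashboard.

```python
import time
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class CallRecord:
    model: str
    prompt_tokens: int
    completion_tokens: int
    seconds: float
    cost_inr: float


def estimate_cost_inr(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Estimate cost in ₹ from token counts and per-million-token rates."""
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


def timed_call(call, model, input_rate, output_rate):
    """Run call(model), time it, and return (response, CallRecord)."""
    start = time.perf_counter()
    response = call(model)
    seconds = time.perf_counter() - start
    usage = response.usage
    record = CallRecord(
        model=model,
        prompt_tokens=usage.prompt_tokens,
        completion_tokens=usage.completion_tokens,
        seconds=seconds,
        cost_inr=estimate_cost_inr(
            usage.prompt_tokens, usage.completion_tokens, input_rate, output_rate
        ),
    )
    return response, record


# Offline stand-in for client.chat.completions.create, so the sketch runs without a key.
def fake_call(model):
    return SimpleNamespace(usage=SimpleNamespace(prompt_tokens=1000, completion_tokens=200))


# 14.0 echoes the ₹14/M input figure quoted earlier; 50.0 is a placeholder output rate.
_, record = timed_call(fake_call, "openai/gpt-4o-mini", input_rate=14.0, output_rate=50.0)
print(record)
```

With a real client, replace `fake_call` with a lambda that wraps `client.chat.completions.create` for the given model, then append each `CallRecord` to whatever log or table your cost analysis reads from.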