How to Use Multiple AI Models with One API Key (Python Tutorial)

A step-by-step Python tutorial for routing requests to GPT-4o, Claude, Gemini, and DeepSeek through a single API key — with cost tracking in ₹ for every call.

Author: AICredits Team

Published: 14 Mar 2026

Reading time: 7 min read

The problem with managing multiple LLM APIs

A typical AI application in 2026 uses two or three different models: a fast, cheap model for triage, a capable model for generation, and sometimes a specialised model for specific tasks. Managing separate API keys, rate limits, and billing for each provider is operational overhead that compounds as your stack grows.

AICredits solves this with a single OpenAI-compatible endpoint that routes to 300+ models across all major providers. One key, one billing wallet, one usage dashboard.

Installation and setup

pip install openai
 
# Set your credentials
export AICREDITS_BASE_URL="https://api.aicredits.in/v1"
export AICREDITS_API_KEY="sk-your-aicredits-key"
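If either variable is unset, the client setup in the next section fails with a bare KeyError. A minimal sketch of a friendlier check is shown here; the helper name load_gateway_config is hypothetical and not part of any SDK.

```python
import os


def load_gateway_config(env=os.environ):
    """Return (base_url, api_key), or raise an error naming what is missing.

    Hypothetical helper: only the two variable names come from the setup above.
    """
    names = ("AICREDITS_BASE_URL", "AICREDITS_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["AICREDITS_BASE_URL"], env["AICREDITS_API_KEY"]
```

Calling load_gateway_config() once at startup surfaces a misconfigured shell before the first request is sent.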

Basic usage: switching models by name

import os
from openai import OpenAI
 
client = OpenAI(
    base_url=os.environ["AICREDITS_BASE_URL"],
    api_key=os.environ["AICREDITS_API_KEY"],
)
 
# GPT-4o
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Explain async/await in Python."}],
)
 
# Claude 3.5 Sonnet — same SDK, same format
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Refactor this function for readability."}],
)
 
# Gemini Flash — cheapest option
response = client.chat.completions.create(
    model="google/gemini-2.0-flash-001",
    messages=[{"role": "user", "content": "Classify this as spam or not spam."}],
)

The gateway routes each request to the correct provider and returns an OpenAI-compatible response.
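Because the response follows the OpenAI chat completion shape, the same fields can be read whichever model answered. The sketch below assumes that shape; summarise_response is a hypothetical helper name, and usage is treated as optional in case a provider omits it.

```python
def summarise_response(response) -> dict:
    """Pull the commonly used fields out of an OpenAI-style chat completion."""
    usage = response.usage  # may be None if usage data is not returned
    return {
        "model": response.model,
        "text": response.choices[0].message.content,
        "prompt_tokens": usage.prompt_tokens if usage else None,
        "completion_tokens": usage.completion_tokens if usage else None,
    }
```

For example, summarise_response(response)["text"] gives the reply text for any of the three calls above.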

Cost-aware routing pattern

# Route tasks to the best model for cost vs quality
ROUTING_TABLE = {
    "classify":   "openai/gpt-4o-mini",          # ₹14/M input — fast, cheap
    "summarise":  "anthropic/claude-3-5-haiku-20241022",  # ₹96/M — better quality
    "generate":   "anthropic/claude-3-5-sonnet-20241022", # ₹289/M — complex tasks
    "code":       "anthropic/claude-3-5-sonnet-20241022", # ₹289/M — best code model
}
 
def ask(task_type: str, prompt: str) -> str:
    model = ROUTING_TABLE.get(task_type, "openai/gpt-4o-mini")
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    usage = response.usage
    print(f"[{task_type}] model={model} tokens={usage.prompt_tokens}+{usage.completion_tokens}")
    return response.choices[0].message.content
 
# Each task uses the appropriate model
label   = ask("classify", "Classify: 'My payment failed' → billing/technical/general")
summary = ask("summarise", "Summarise this 500-word article in 2 sentences: ...")
reply   = ask("generate", "Write a professional reply to this customer complaint: ...")
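The token counts printed by ask() can be turned into a rough ₹ figure. The sketch below reuses the per-million rates quoted in the routing table comments and assumes all of them are input-side rates; output rates are not given in this article, so only the prompt side is estimated. The names INPUT_RATE_INR_PER_M and estimate_input_cost_inr are hypothetical.

```python
# ₹ per million input tokens, copied from the routing table comments.
# Assumption: every figure is an input rate. Output pricing is not covered here.
INPUT_RATE_INR_PER_M = {
    "openai/gpt-4o-mini": 14,
    "anthropic/claude-3-5-haiku-20241022": 96,
    "anthropic/claude-3-5-sonnet-20241022": 289,
}


def estimate_input_cost_inr(model: str, prompt_tokens: int) -> float:
    """Estimate the input-side cost of one call in rupees."""
    rate = INPUT_RATE_INR_PER_M[model]  # KeyError for models without a known rate
    return prompt_tokens * rate / 1_000_000
```

Inside ask(), this could be called as estimate_input_cost_inr(model, usage.prompt_tokens) and printed alongside the token counts. Treat the result as an estimate and check it against the usage dashboard.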

Fallback handling

AICredits handles provider-level failover automatically. For application-level fallback (fall back to a cheaper model after a timeout), wrap your call in a try/except:

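from openai import APIConnectionError, APIStatusError
 
def resilient_ask(prompt: str) -> str:
    # Try the stronger model first, then fall back to the cheaper one.
    for model in ["anthropic/claude-3-5-sonnet-20241022", "openai/gpt-4o-mini"]:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                timeout=10,  # seconds
            )
            return response.choices[0].message.content
        # The SDK raises APITimeoutError on timeout, not the built-in TimeoutError.
        # APITimeoutError is a subclass of APIConnectionError, so it is caught here.
        except (APIStatusError, APIConnectionError):
            continue
    raise RuntimeError("All models failed")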

Log the model used, token counts, and response time for each request. This data feeds the cost analysis that helps you refine your routing strategy over time.
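One way to capture those three fields is a thin wrapper around the call. This is a minimal sketch, not a prescribed logging format: timed_call and the "llm_usage" logger name are hypothetical, and create_fn is expected to be something like client.chat.completions.create.

```python
import logging
import time

logger = logging.getLogger("llm_usage")


def timed_call(create_fn, model, messages, **kwargs):
    """Call create_fn and return (response, record) with model, tokens, and latency."""
    start = time.perf_counter()
    response = create_fn(model=model, messages=messages, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000

    usage = getattr(response, "usage", None)  # usage may be absent or None
    record = {
        "model": model,
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
        "completion_tokens": getattr(usage, "completion_tokens", None),
        "elapsed_ms": round(elapsed_ms, 1),
    }
    logger.info("llm_call %s", record)
    return response, record
```

Usage would look like timed_call(client.chat.completions.create, "openai/gpt-4o-mini", messages). The returned record can be appended to a file or table for later cost analysis.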
