Build a LangChain Agent That Costs ₹0.02 Per Run

A practical guide to building a cost-efficient LangChain agent in Python using affordable models available in India, with real INR cost breakdowns per tool call.

Author: AICredits Team
Published: 28 Feb 2026
Reading time: 8 min read

Why agent cost matters

A single LangChain agent run can involve 3–8 LLM calls depending on tool use and self-correction loops. At GPT-4o prices, that adds up to ₹6–20 per run — fine for prototypes, painful at 10,000 runs per day.
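To make the scale concrete, the per-run range above can be multiplied out to a daily figure. This is a minimal sketch; the function name is illustrative, and the inputs are the ₹6–20 per run and 10,000 runs per day quoted above.

```python
def daily_cost_inr(cost_per_run_inr: float, runs_per_day: int) -> float:
    """Daily spend in INR for a given per-run cost and daily volume."""
    return cost_per_run_inr * runs_per_day


# Figures taken from the paragraph above: Rs 6 to Rs 20 per run, 10,000 runs per day.
low = daily_cost_inr(6, 10_000)
high = daily_cost_inr(20, 10_000)
print(f"₹{low:,.0f} to ₹{high:,.0f} per day")
```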

The key insight is that most agent steps do not need a frontier model. Routing, tool selection, and parameter extraction are well within the capability of models priced under $1 per million input tokens, such as GPT-4o Mini at $0.15.

Setup: connect LangChain to AICredits

pip install langchain langchain-openai

from langchain_openai import ChatOpenAI
 
# Use any model via AICredits — change only base_url and api_key
llm = ChatOpenAI(
    model="anthropic/claude-3-5-haiku-20241022",
    base_url="https://api.aicredits.in/v1",
    api_key="sk-your-aicredits-key",
    temperature=0,
)

The rest of your LangChain code is unchanged. Tools, chains, agents, memory — all work as normal.
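In a real project you would probably not hardcode the key as in the snippet above. One possible approach, sketched here, is to read it from an environment variable and build the constructor arguments in one place. The helper name `aicredits_kwargs` and the variable name `AICREDITS_API_KEY` are assumptions made for this example, not names defined by AICredits or LangChain.

```python
import os

AICREDITS_BASE_URL = "https://api.aicredits.in/v1"


def aicredits_kwargs(model: str, env_var: str = "AICREDITS_API_KEY") -> dict:
    """Build keyword arguments for ChatOpenAI without hardcoding the key.

    The environment variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "model": model,
        "base_url": AICREDITS_BASE_URL,
        "api_key": api_key,
        "temperature": 0,
    }
```

The result can then be unpacked into the constructor shown earlier, for example `ChatOpenAI(**aicredits_kwargs("anthropic/claude-3-5-haiku-20241022"))`.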

Choosing the right model for each agent step

Use Claude 3.5 Haiku or GPT-4o Mini for tool selection and parameter extraction — these steps are structured and predictable. Use Claude 3.5 Sonnet or GPT-4o only for the final synthesis step where response quality visibly matters to your user.
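The split described above can be written down as a small lookup table, so the choice of model per step lives in one place. This is only a sketch: the step names are invented for illustration, and `openai/gpt-4o` is an assumed model ID that follows the provider/model pattern used elsewhere in this post.

```python
CHEAP_MODEL = "openai/gpt-4o-mini"
STRONG_MODEL = "openai/gpt-4o"  # assumed ID; confirm against the AICredits model list

# Hypothetical step names; structured steps go to the cheap model.
MODEL_FOR_STEP = {
    "tool_selection": CHEAP_MODEL,
    "parameter_extraction": CHEAP_MODEL,
    "final_synthesis": STRONG_MODEL,
}


def pick_model(step: str) -> str:
    """Return the model ID for a step, defaulting to the cheap model."""
    return MODEL_FOR_STEP.get(step, CHEAP_MODEL)


print(pick_model("tool_selection"))
print(pick_model("final_synthesis"))
```

Defaulting unknown steps to the cheap model keeps costs low by default; escalation to the stronger model then has to be an explicit decision.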

from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
 
# Cheap model for tool selection steps
cheap_llm = ChatOpenAI(
    model="openai/gpt-4o-mini",
    base_url="https://api.aicredits.in/v1",
    api_key="sk-your-aicredits-key",
)
 
# Better model for final answer synthesis (defined here, not wired into the agent below)
smart_llm = ChatOpenAI(
    model="anthropic/claude-3-5-sonnet-20241022",  # assumed ID; confirm in the AICredits model list
    base_url="https://api.aicredits.in/v1",
    api_key="sk-your-aicredits-key",
)
 
@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 28°C, partly cloudy"
 
@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    return str(eval(expression))  # simplified example; eval is unsafe on untrusted input
 
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when needed."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
 
agent = create_tool_calling_agent(cheap_llm, [get_weather, calculate], prompt)
executor = AgentExecutor(agent=agent, tools=[get_weather, calculate], verbose=True)
 
result = executor.invoke({"input": "What is the weather in Mumbai? Also, what is 15% of 2400?"})
print(result["output"])
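The `calculate` tool above calls `eval` on a string the model produces, which is fine for a demo but risky in production. A safer variant might parse the expression with Python's `ast` module and allow only basic arithmetic. The sketch below is one way to do that; `safe_eval` is a name introduced here, and it deliberately leaves out exponentiation to avoid very expensive inputs.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str) -> float:
    """Evaluate +, -, *, / and % on numeric literals only."""

    def _eval(node):
        if isinstance(node, ast.Constant):
            value = node.value
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("0.15 * 2400"))
```

Inside the tool, `str(safe_eval(expression))` could then replace `str(eval(expression))`; function calls, names, and attribute access all raise `ValueError`.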

Hitting ₹0.02 per run with the right model

For many agentic tasks, such as web search summarisation, document Q&A, and simple data extraction, GPT-4o Mini handles all steps adequately. At $0.15 per million input tokens (₹14.4 per million), a 5-step agent sending 300 input tokens per call costs roughly:

5 steps × 300 tokens × ₹14.4 / 1,000,000 = ₹0.0216, or about ₹0.02 per agent run

This counts input tokens only; output tokens are billed separately and add to the total.
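The same arithmetic as a small function, so you can plug in your own step count and token sizes. This is a rough sketch with an illustrative function name; it counts only input tokens at a flat rate and ignores output tokens and any growth in prompt size between steps.

```python
def estimate_run_cost_inr(steps: int, tokens_per_call: int,
                          inr_per_million_tokens: float) -> float:
    """Rough per-run cost: steps x tokens per call x price per token."""
    return steps * tokens_per_call * inr_per_million_tokens / 1_000_000


# Numbers from the worked example: 5 steps, 300 tokens per call, Rs 14.4 per million.
cost = estimate_run_cost_inr(5, 300, 14.4)
print(f"₹{cost:.4f} per run")
```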

The pattern: profile your agent on 50 sample inputs, measure task success rate with GPT-4o Mini, and only escalate individual failing steps to a more capable model. Most tasks land between ₹0.02 and ₹0.20 per run.
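The profiling step can be as simple as a loop over labelled samples. The sketch below assumes you can wrap your agent as a function from input string to output string and that you have some way to judge correctness; `success_rate`, `fake_agent`, and the sample data are all placeholders for illustration.

```python
from typing import Callable, Iterable, Tuple


def success_rate(run_agent: Callable[[str], str],
                 samples: Iterable[Tuple[str, str]],
                 is_correct: Callable[[str, str], bool]) -> float:
    """Fraction of (input, expected) samples the agent gets right."""
    samples = list(samples)
    if not samples:
        raise ValueError("Need at least one sample")
    passed = sum(1 for question, expected in samples
                 if is_correct(run_agent(question), expected))
    return passed / len(samples)


# Stand-in for a real agent call such as executor.invoke(...)["output"].
def fake_agent(question: str) -> str:
    answers = {"2+2": "4", "capital of India": "New Delhi"}
    return answers.get(question, "unknown")


samples = [("2+2", "4"), ("capital of India", "New Delhi"), ("3*3", "9")]
rate = success_rate(fake_agent, samples, lambda got, want: got == want)
print(f"Success rate: {rate:.0%}")
```

Running this once per candidate model gives a like-for-like comparison, and the failing samples show which steps might justify escalating to a more capable model.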

Cost tracking per agent run

AICredits logs every LLM call with model, token counts, and INR cost. Export a CSV from the Usage tab in the dashboard to see per-request breakdowns.

Tag your requests by passing a unique identifier in the user field of the API call. This lets you filter the usage CSV by agent type and attribute cost to specific product features.
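Once requests are tagged, the exported CSV can be summed per tag with the standard library. The column names `user` and `cost_inr` below are assumptions made for this sketch; check the header row of your actual export and adjust them.

```python
import csv
from collections import defaultdict
from typing import Dict, TextIO


def cost_by_tag(csv_file: TextIO, tag_column: str = "user",
                cost_column: str = "cost_inr") -> Dict[str, float]:
    """Sum the INR cost per tag from a usage CSV.

    Column names are hypothetical defaults, not the documented export schema.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row[tag_column]] += float(row[cost_column])
    return dict(totals)
```

Typical usage would be `with open("usage.csv", newline="") as f: print(cost_by_tag(f))`. With `langchain_openai`, passing the tag via `ChatOpenAI(..., model_kwargs={"user": "support-agent"})` is one way that should forward the field to the API, though it is worth confirming against a test request in the dashboard.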
