NVIDIA Models with INR Pricing

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

nvidia/llama-3.3-nemotron-super-49b-v1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct, with a 128K (131,072-token) context window. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Chat API

Context: 131K

Input: from ₹9.92/1M tokens

Cached input: ₹0.99/1M tokens

Output: ₹39.69/1M tokens
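To illustrate how per-1M-token rates like these translate into the cost of a single request, here is a minimal sketch using the rates listed for nvidia/llama-3.3-nemotron-super-49b-v1.5. The function name `estimate_cost_inr` is hypothetical, and the billing model is an assumption: cached tokens are treated as a subset of the input tokens and charged at the cached rate instead of the regular input rate. The listing says input is priced "from" ₹9.92, so the actual input rate may be higher in some cases.

```python
def estimate_cost_inr(input_tokens, output_tokens, cached_tokens=0,
                      input_rate=9.92, cached_rate=0.99, output_rate=39.69):
    """Estimate the cost of one request in INR.

    All rates are in INR per 1M tokens. Assumption (not confirmed by the
    listing): cached tokens are part of the input and are billed at the
    cached rate instead of the regular input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * input_rate
             + cached_tokens * cached_rate
             + output_tokens * output_rate)
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input tokens (40K of them cached) and 2K output tokens.
    cost = estimate_cost_inr(100_000, 2_000, cached_tokens=40_000)
    print(f"Estimated cost: ₹{cost:.4f}")
```

Under these assumptions, the example request costs roughly ₹0.71, most of it from uncached input tokens.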

NVIDIA: Nemotron 3 Nano 30B A3B

nvidia/nemotron-3-nano-30b-a3b

NVIDIA Nemotron 3 Nano 30B A3B is a small mixture-of-experts (MoE) language model built for high compute efficiency and accuracy, aimed at developers building specialized agentic AI systems. The model is fully...

Chat API

Context: 262K

Input: from ₹4.96/1M tokens

Cached input: ₹0.50/1M tokens

Output: ₹19.84/1M tokens

nvidia/nemotron-3-nano-omni-30b-a3b

nvidia/nemotron-3-nano-omni-30b-a3b

Chat API

Context: 262K

Input: from ₹19.84/1M tokens

Cached input: ₹1.98/1M tokens

Output: ₹79.37/1M tokens

NVIDIA: Nemotron 3 Super

nvidia/nemotron-3-super-120b-a12b

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer...

Chat API

Context: 1.0M

Input: from ₹9.92/1M tokens

Cached input: ₹0.89/1M tokens

Output: ₹49.61/1M tokens

NVIDIA: Nemotron 3 Ultra

nvidia/nemotron-3-ultra-550b-a55b

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Chat API

Context: 1.0M

Input: from ₹49.61/1M tokens

Output: ₹248.04/1M tokens

nvidia/nemotron-nano-12b-v2

nvidia/nemotron-nano-12b-v2

Chat API

Context: 131K

Input: from ₹5.95/1M tokens

Cached input: ₹0.60/1M tokens

Output: ₹5.95/1M tokens

NVIDIA: Nemotron Nano 9B V2

nvidia/nemotron-nano-9b-v2

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and...

Chat API

Context: 131K

Input: from ₹3.97/1M tokens

Cached input: ₹0.40/1M tokens

Output: ₹15.87/1M tokens
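Because input and output rates differ from model to model, which model is cheapest depends on the token mix of the workload. The sketch below ranks the seven listed models for a given number of input and output tokens. The `RATES` table copies the input and output prices from this listing; the helper name `rank_by_cost` is hypothetical, and the calculation deliberately ignores cached-input discounts and any input pricing above the listed "from" rate.

```python
# Rates in INR per 1M tokens, copied from the listing: (input, output).
RATES = {
    "nvidia/llama-3.3-nemotron-super-49b-v1.5": (9.92, 39.69),
    "nvidia/nemotron-3-nano-30b-a3b": (4.96, 19.84),
    "nvidia/nemotron-3-nano-omni-30b-a3b": (19.84, 79.37),
    "nvidia/nemotron-3-super-120b-a12b": (9.92, 49.61),
    "nvidia/nemotron-3-ultra-550b-a55b": (49.61, 248.04),
    "nvidia/nemotron-nano-12b-v2": (5.95, 5.95),
    "nvidia/nemotron-nano-9b-v2": (3.97, 15.87),
}


def rank_by_cost(input_tokens, output_tokens, rates=RATES):
    """Return (model_id, cost_inr) pairs sorted from cheapest to most expensive."""
    costs = [
        (model, (input_tokens * inp + output_tokens * out) / 1_000_000)
        for model, (inp, out) in rates.items()
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Balanced workload: 1M input tokens and 1M output tokens.
    for model, cost in rank_by_cost(1_000_000, 1_000_000):
        print(f"{model}: ₹{cost:.2f}")
```

With a balanced workload, nvidia/nemotron-nano-12b-v2 comes out cheapest because its output rate equals its input rate; for an input-only workload, nvidia/nemotron-nano-9b-v2 ranks first instead.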
