Sarvam AI: Sarvam-M

Text input Text output Unavailable
Author's Description

Sarvam-M is a 24 B-parameter, instruction-tuned derivative of Mistral-Small-3.1-24B-Base-2503, post-trained on English plus eleven major Indic languages (bn, hi, kn, gu, mr, ml, or, pa, ta, te). The model introduces a dual-mode interface: “non-think” for low-latency chat and a optional “think” phase that exposes chain-of-thought tokens for more demanding reasoning, math, and coding tasks. Benchmark reports show solid gains versus similarly sized open models on Indic-language QA, GSM-8K math, and SWE-Bench coding, making Sarvam-M a practical general-purpose choice for multilingual conversational agents as well as analytical workloads that mix English, native Indic scripts, or romanized text.

Key Specifications
Cost
$$
Context
32K
Parameters
24B (Rumoured)
Released
May 25, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Frequency Penalty Top P Logprobs Min P Temperature Stop Presence Penalty Max Tokens Logit Bias Top Logprobs
Performance Summary

Sarvam-M, a 24B-parameter instruction-tuned model from sarvamai, demonstrates strong performance across several key metrics. It consistently ranks among the fastest models and offers highly competitive pricing, making it an efficient and cost-effective choice. The model exhibits exceptional reliability with a 100% success rate across all benchmarks, indicating a robust and stable operational profile. In terms of benchmark performance, Sarvam-M achieves high accuracy in General Knowledge (99.8%, 85th percentile) and Ethics (99.0%, 63rd percentile). It excels in Email Classification, achieving perfect 100.0% accuracy, positioning it as a top performer and the most accurate model at its price point and speed. The model also shows solid capabilities in Coding with 92.9% accuracy (82nd percentile). A notable weakness is its 0.0% accuracy in Instruction Following, suggesting this is an area requiring significant improvement. The dual-mode interface, offering both low-latency chat and a "think" phase for complex tasks, combined with its strong Indic language support, positions Sarvam-M as a practical general-purpose model for multilingual conversational and analytical workloads.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.022
Completion $0.022

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Chutes
Chutes | sarvamai/sarvam-m 32K $0.022 / 1M tokens $0.022 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration