Author's Description
Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab. It has 30 billion total parameters, of which only 3 billion are activated per token. It is optimized for long-horizon, deep information-seeking tasks and reports state-of-the-art performance on benchmarks including Humanity's Last Exam, BrowseComp, BrowseComp-ZH, WebWalkerQA, GAIA, xbench-DeepSearch, and FRAMES, which the authors present as an advantage over prior models in complex agentic search, reasoning, and multi-step problem-solving. The model is built with a fully automated synthetic data pipeline that scales across pre-training, fine-tuning, and reinforcement learning. It uses large-scale continual pre-training on diverse agentic data to strengthen reasoning and keep its knowledge current. It is also trained with end-to-end on-policy RL using a customized Group Relative Policy Optimization, with token-level gradients and negative-sample filtering for stable training. At inference time, the model supports a ReAct mode for evaluating its core abilities and an IterResearch-based 'Heavy' mode that uses test-time scaling for maximum performance. It is suited to advanced research agents, tool use, and heavy inference workflows.
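To make the "group relative" idea in the description concrete, here is a minimal sketch of the advantage normalization used in standard Group Relative Policy Optimization: each sampled response is scored against the mean and standard deviation of its own group. This is an illustration only. The description does not detail the customized variant, so token-level gradients and negative-sample filtering are not modeled here, and the function name, the use of the population standard deviation, and the `eps` term are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Sketch of the standard GRPO advantage, not the customized
    variant described above (assumption: population std, small eps
    to avoid division by zero when all rewards are equal).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that score above the group mean receive a positive advantage and those below it a negative one, so no separate value model is needed to form the baseline.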
Performance Summary
Tongyi DeepResearch 30B A3B, developed by Alibaba, is an agentic large language model optimized for long-horizon, deep information-seeking tasks. Its responses tend to be slow, ranking in the 12th percentile for speed, while its pricing is moderate, in the 39th percentile. Reliability is a strength, with an 89% success rate. The model performs exceptionally well in Coding (98th percentile accuracy) and strongly in General Knowledge (99.0% accuracy), Reasoning (93.6% accuracy), and Email Classification (99.0% accuracy). Its Mathematics performance is average, at 68.1% accuracy. Its notable weaknesses are Ethics (57.3% accuracy, 16th percentile) and Hallucinations (80.0% accuracy, 23rd percentile). Overall, Tongyi DeepResearch excels at complex problem-solving and information retrieval, which makes it suitable for advanced research agents and heavy inference workflows, provided its slower responses and its weaker ethics and hallucination results are acceptable.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.09 |
| Completion | $0.45 |
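As a rough worked example of how these per-million-token prices translate into the cost of one request, the sketch below applies the two listed rates to a token count. The function name and the token counts are hypothetical, and real billing may differ by provider.

```python
# Listed prices in USD per 1M tokens (from the pricing table).
PROMPT_PRICE_PER_M = 0.09
COMPLETION_PRICE_PER_M = 0.45


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single request at the listed rates."""
    total = (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    )
    return total / 1_000_000


# Hypothetical research run: 100K prompt tokens, 20K completion tokens.
print(f"${estimate_cost(100_000, 20_000):.4f}")
```

Because completion tokens cost five times as much as prompt tokens, long generated outputs dominate the bill faster than long inputs do.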
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| AtlasCloud | alibaba/tongyi-deepresearch-30b-a3b | 131K | $0.09 / 1M tokens | $0.45 / 1M tokens |
| NCompass | alibaba/tongyi-deepresearch-30b-a3b | 131K | $0.09 / 1M tokens | $0.40 / 1M tokens |
| Chutes | alibaba/tongyi-deepresearch-30b-a3b | 131K | $0.09 / 1M tokens | $0.40 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|---|