TNG: DeepSeek R1T2 Chimera

Text input Text output Free Option
Author's Description

DeepSeek-TNG-R1T2-Chimera is the second-generation Chimera model from TNG Tech. It is a 671B-parameter mixture-of-experts text-generation model assembled from DeepSeek-AI’s R1-0528, R1, and V3-0324 checkpoints with an Assembly-of-Experts merge. The tri-parent design yields strong reasoning performance while running roughly 20% faster than the original R1 and more than 2× faster than R1-0528 under vLLM, giving a favorable cost-to-intelligence trade-off. The checkpoint supports contexts up to 60k tokens in standard use (tested to ~130k) and maintains consistent `<think>` token behaviour, making it suitable for long-context analysis, dialogue, and other open-ended generation tasks.
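
The Assembly-of-Experts procedure itself is not documented on this page. As a purely illustrative sketch of the general idea of blending matching weight tensors from several parent checkpoints, the snippet below performs a generic weighted average; the helper name, coefficients, and usage are hypothetical and are not TNG's actual method.

```python
import torch

def merge_state_dicts(parents, coefficients):
    """Illustrative weighted average of matching tensors from parent checkpoints.

    parents      -- list of state dicts with identical keys and shapes (hypothetical)
    coefficients -- per-parent mixing weights, assumed to sum to 1.0
    """
    assert abs(sum(coefficients) - 1.0) < 1e-6, "mixing weights should sum to 1"
    merged = {}
    for name in parents[0]:
        merged[name] = sum(c * sd[name].float() for c, sd in zip(coefficients, parents))
    return merged

# Hypothetical usage: blend three parent checkpoints (e.g. R1-0528, R1, V3-0324)
# with illustrative coefficients -- not the coefficients TNG actually used.
# merged = merge_state_dicts([sd_r1_0528, sd_r1, sd_v3_0324], [0.5, 0.3, 0.2])
```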

Key Specifications
Cost
$$$$
Context
163K
Parameters
671B
Released
Jul 08, 2025
Supported Parameters

This model supports the following parameters:

Reasoning, Include Reasoning, Seed, Top P, Temperature, Top Logprobs, Logit Bias, Logprobs, Stop, Min P, Max Tokens, Frequency Penalty, Presence Penalty
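
As a minimal sketch of how a few of these parameters might be passed, assuming an OpenAI-compatible chat-completions endpoint (the OpenRouter base URL, API-key variable, and prompt below are assumptions, not values confirmed by this page):

```python
import os
import requests

# Minimal sketch: a chat-completions request for tngtech/deepseek-r1t2-chimera
# using a handful of the supported sampling parameters.
response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "tngtech/deepseek-r1t2-chimera",
        "messages": [{"role": "user", "content": "Summarize the Assembly-of-Experts idea."}],
        "temperature": 0.6,
        "top_p": 0.95,
        "max_tokens": 1024,
        "seed": 42,
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```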
Features

This model supports the following features:

Reasoning
Performance Summary

DeepSeek-TNG-R1T2-Chimera, TNG Tech's second-generation 671B-parameter mixture-of-experts model, demonstrates strong performance across several key metrics. It consistently ranks among the fastest models available across five benchmarks and offers highly competitive pricing. The model exhibits outstanding reliability with a 100% success rate across all benchmarks, indicating minimal technical failures and consistent response generation. In terms of benchmark performance, the model achieved 100% accuracy in General Knowledge, Email Classification, and Ethics, often being the most accurate model at its price point and speed. Its Coding performance was also strong at 95% accuracy, placing it in the 96th percentile. However, a significant weakness is observed in Instruction Following, where it recorded 0.0% accuracy, a critical area for improvement. Despite this, its tri-parent design contributes to strong reasoning and a favorable cost-to-intelligence trade-off, running significantly faster than its predecessors. Its ability to handle contexts up to 60k tokens (tested to ~130k) and consistent `<think>` token behavior make it suitable for long-context analysis and open-ended generation tasks.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.25
Completion $1
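
As a quick worked example using the listed rates (prompt at $0.25 and completion at $1 per 1M tokens), the per-request cost can be estimated as follows; the token counts are made up for illustration.

```python
PROMPT_PRICE_PER_M = 0.25      # USD per 1M prompt tokens (from the table above)
COMPLETION_PRICE_PER_M = 1.00  # USD per 1M completion tokens (from the table above)

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-token rates."""
    return (prompt_tokens / 1_000_000) * PROMPT_PRICE_PER_M \
        + (completion_tokens / 1_000_000) * COMPLETION_PRICE_PER_M

# Illustrative token counts, not real usage data:
# 20k prompt tokens + 2k completion tokens ≈ $0.005 + $0.002 = $0.007
print(f"${request_cost(20_000, 2_000):.4f}")
```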

Price History

Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output)
Chutes | tngtech/deepseek-r1t2-chimera | 163K | $0.25 / 1M tokens | $1 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by tngtech