Cogito V2 Preview Llama 109B

Image input · Text input · Text output
Author's Description

An instruction-tuned, hybrid-reasoning Mixture-of-Experts model built on Llama-4-Scout-17B-16E. Cogito v2 can answer directly or engage an extended “thinking” phase, with alignment guided by Iterated Distillation & Amplification (IDA). It targets coding, STEM, instruction following, and general helpfulness, with stronger multilingual, tool-calling, and reasoning performance than size-equivalent baselines. The model supports long-context use (up to 10M tokens) and standard Transformers workflows. Users can control reasoning behaviour via the `enabled` boolean of the `reasoning` parameter. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)
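As a sketch of how the reasoning toggle might be used, the snippet below builds an OpenRouter-style chat-completions request body with the `reasoning.enabled` boolean described in the linked docs. The prompt text and helper name are illustrative; the model slug matches the endpoint listed further down this page.

```python
import json

def build_request(prompt: str, think: bool) -> dict:
    """Assemble a chat-completions request body; `think` toggles the
    extended "thinking" phase via the `reasoning.enabled` boolean."""
    return {
        "model": "deepcogito/cogito-v2-preview-llama-109b-moe",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"enabled": think},
    }

# Illustrative payload with extended reasoning switched on.
payload = build_request("Prove that sqrt(2) is irrational.", think=True)
print(json.dumps(payload, indent=2))
```

Setting `"enabled": False` instead requests a direct answer with no thinking phase.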

Key Specifications
Cost
$$
Context
32K
Parameters
109B
Released
Sep 02, 2025
Supported Parameters

This model supports the following parameters:

Include Reasoning, Stop, Max Tokens, Tool Choice, Top P, Frequency Penalty, Reasoning, Logit Bias, Min P, Tools, Temperature, Presence Penalty
Features

This model supports the following features:

Reasoning, Tools
Performance Summary

Cogito V2 Preview Llama 109B demonstrates exceptional speed, consistently ranking among the fastest models across all evaluated benchmarks. Its pricing is cost-effective, placing it in the 66th percentile for affordability. Specific reliability metrics are not provided.

A significant strength is its perfect accuracy on the Hallucinations (Baseline) test, indicating a robust ability to acknowledge uncertainty rather than generate fictional information; this makes it a strong choice for applications where factual integrity is paramount. The model also performs respectably in Reasoning, achieving 64.0% accuracy, which is competitive for its price point and speed category.

However, the model exhibits critical weaknesses in several core areas, scoring 0.0% accuracy on the Instruction Following, General Knowledge, Coding, Email Classification, and Ethics benchmarks. These widespread zero-accuracy scores indicate that while the model can recognize when it doesn't know something, it struggles significantly with tasks requiring specific knowledge, complex instruction adherence, or domain-specific understanding.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
| --- | --- |
| Prompt | $0.18 |
| Completion | $0.59 |
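The listed per-1M-token rates make a back-of-envelope cost estimate straightforward: multiply prompt and completion token counts by their respective rates and sum. The token counts below are illustrative, not measured.

```python
# Listed rates: $0.18 per 1M prompt tokens, $0.59 per 1M completion tokens.
PROMPT_RATE = 0.18 / 1_000_000
COMPLETION_RATE = 0.59 / 1_000_000

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated request cost in USD from the listed per-1M-token rates."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. a 20K-token prompt with a 2K-token completion:
cost = estimate_cost(20_000, 2_000)
print(f"${cost:.4f}")  # → $0.0048
```

Note that with reasoning enabled, thinking tokens are billed as completion tokens, so reasoned answers cost more at the higher $0.59 rate.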

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
| --- | --- | --- | --- | --- |
| Together | deepcogito/cogito-v2-preview-llama-109b-moe | 32K | $0.18 / 1M tokens | $0.59 / 1M tokens |