Author's Description
Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with <think>...</think> traces or respond directly, offering flexibility between speed and depth. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config) The model is instruction-tuned with an expanded post-training corpus (~60B tokens) emphasizing reasoning traces, improving performance in math, code, STEM, and logical reasoning, while retaining broad assistant utility. It also supports structured outputs, including JSON mode, schema adherence, function calling, and tool use. Hermes 4 is trained for steerability, lower refusal rates, and alignment toward neutral, user-directed behavior.
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
Nous: Hermes 4 405B, a large-scale reasoning model built on Meta-Llama-3.1-405B, demonstrates competitive performance across various benchmarks. It exhibits competitive response times, ranking in the 59th percentile for speed, and offers moderate pricing, placing it in the 34th percentile. A standout feature is its exceptional reliability, achieving a 100% success rate across all benchmarks, indicating consistent and usable responses. The model excels in specific areas, achieving perfect accuracy in Ethics (Baseline) and near-perfect scores in Email Classification (99.0%) and General Knowledge (99.5%). These results highlight its strong capabilities in ethical reasoning, classification tasks, and broad factual recall. Its expanded post-training corpus, emphasizing reasoning traces, appears to contribute to its strong performance in these domains. While demonstrating solid performance in Coding (83.0% accuracy), its Instruction Following (61.0%) and Reasoning (60.0%) scores, though respectable, suggest areas for potential improvement compared to its top-tier performance in other categories. The introduction of a hybrid reasoning mode with internal deliberation offers flexibility, and its support for structured outputs like JSON mode and function calling enhances its utility for developers.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $1 |
Completion | $3 |
Price History
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
Nebius
|
Nebius | nousresearch/hermes-4-405b | 131K | $1 / 1M tokens | $3 / 1M tokens |
Chutes
|
Chutes | nousresearch/hermes-4-405b | 131K | $0.2 / 1M tokens | $0.8 / 1M tokens |
Benchmark Results
Benchmark | Category | Reasoning | Free | Executions | Accuracy | Cost | Duration |
---|
Other Models by nousresearch
|
Released | Params | Context |
|
Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
Nous: Hermes 4 70B | Aug 26, 2025 | 70B | 131K |
Text input
Text output
|
★★★★ | ★★★ | $$ |
Nous: DeepHermes 3 Mistral 24B Preview | May 09, 2025 | 24B | 32K |
Text input
Text output
|
★★★★ | ★★★ | $$ |
Nous: Hermes 3 70B Instruct | Aug 17, 2024 | 70B | 131K |
Text input
Text output
|
★★★★ | ★★★★ | $$ |
Nous: Hermes 3 405B Instruct | Aug 15, 2024 | 405B | 131K |
Text input
Text output
|
★★★ | ★★★★ | $$$$ |
NousResearch: Hermes 2 Pro - Llama-3 8B | May 26, 2024 | 8B | 131K |
Text input
Text output
|
★★★★★ | ★★★ | $ |
Nous: Hermes 2 Mixtral 8x7B DPO Unavailable | Jan 15, 2024 | 56B | 32K |
Text input
Text output
|
★★★ | ★★ | $$$$ |