Prime Intellect: INTELLECT-3

Modalities: text input → text output
Author's Description

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math, code, science, and general reasoning, consistently outperforming many larger frontier models. Designed for strong multi-step problem solving, it maintains high accuracy on structured tasks while remaining efficient at inference thanks to its MoE architecture.

Key Specifications
Context
131K
Parameters
106B total (12B active)
Released
Nov 26, 2025
Supported Parameters

This model supports the following parameters:

Include Reasoning, Temperature, Tools, Reasoning, Max Tokens, Presence Penalty, Structured Outputs, Frequency Penalty, Response Format, Tool Choice, Top P
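The sampling and reasoning parameters above can be sketched as an OpenAI-compatible chat-completions payload. This is a minimal illustration, not a confirmed provider API: the snake_case field names and the shape of the `reasoning` object are assumptions based on common OpenAI-style conventions; only the model identifier is taken from the endpoint table below.

```python
# Hypothetical request payload for an OpenAI-compatible chat-completions
# endpoint. Field names are assumptions based on common OpenAI-style APIs;
# the model id comes from this page's endpoint listing.
payload = {
    "model": "prime-intellect/intellect-3-20251126",
    "messages": [
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ],
    "temperature": 0.6,               # Temperature
    "top_p": 0.95,                    # Top P
    "max_tokens": 4096,               # Max Tokens
    "presence_penalty": 0.0,          # Presence Penalty
    "frequency_penalty": 0.0,         # Frequency Penalty
    "include_reasoning": True,        # Include Reasoning
    "reasoning": {"effort": "high"},  # Reasoning (object shape is an assumption)
}

print(sorted(payload.keys()))
```

In practice this dict would be sent as the JSON body of a POST to the provider's chat-completions route with an API key in the `Authorization` header.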
Features

This model supports the following features:

Response Format, Tools, Structured Outputs, Reasoning
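The Structured Outputs feature typically means the model can be constrained to emit JSON matching a schema. A hedged sketch, assuming the common OpenAI-style `response_format` with a JSON schema (the exact field names are not confirmed for this model's providers):

```python
import json

# Hypothetical "response_format" value constraining the model to a JSON
# schema; the "json_schema"/"strict" layout follows common OpenAI-compatible
# conventions and is an assumption, not a documented API for this model.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "math_answer",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "answer": {"type": "string"},
                "confidence": {"type": "number"},
            },
            "required": ["answer", "confidence"],
            "additionalProperties": False,
        },
    },
}

# A structured reply can then be parsed directly, with no regex scraping:
reply = '{"answer": "42", "confidence": 0.9}'
parsed = json.loads(reply)
print(parsed["answer"])
```

Schema-constrained output is what makes the model useful in pipelines that feed its answers into downstream code rather than to a human reader.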
Performance Summary

Prime Intellect's INTELLECT-3, a 106B-parameter Mixture-of-Experts model, demonstrates exceptional performance across several key metrics. It consistently ranks among the fastest and most competitively priced models across 8 benchmarks, and it exhibits strong reliability with a 97% success rate, indicating minimal technical failures. On specific benchmarks, INTELLECT-3 shows remarkable strength in Mathematics, achieving 96.0% accuracy (95th percentile), and strong performance in Coding (91.0% accuracy, 71st percentile). General Knowledge and Ethics also score highly, at 98.0% each.

However, the model struggles significantly with tasks that require acknowledging uncertainty, scoring 0.0% on the Hallucinations (Baseline) test: it does not appropriately answer "I don't know" when asked about fictional concepts. Similarly, its Instruction Following score is 0.0%, suggesting a fundamental issue with adhering to complex instructions. Reasoning accuracy is moderate at 52.0% (32nd percentile). And while its MoE architecture aims for inference efficiency, run durations on some benchmarks, particularly Mathematics and Reasoning, are quite high.

Model Pricing

Current Pricing

Feature      Price (per 1M tokens)
Prompt       $0.20
Completion   $1.10
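At the listed rates, per-request cost is simple arithmetic: tokens times price per token, summed over prompt and completion. A small sketch:

```python
# Cost estimate from the listed prices: $0.20 per 1M prompt tokens and
# $1.10 per 1M completion tokens.
PROMPT_PRICE = 0.20 / 1_000_000      # USD per prompt token
COMPLETION_PRICE = 1.10 / 1_000_000  # USD per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of a single request at the listed rates."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# Example: a 10K-token prompt with a 2K-token completion.
cost = request_cost(10_000, 2_000)
print(f"${cost:.4f}")  # → $0.0042
```

Note that for a reasoning model, completion-side usage (including any billed reasoning tokens) usually dominates the bill, since the output rate is 5.5× the input rate here.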

Available Endpoints
Provider   Endpoint Name                           Context Length   Pricing (Input)     Pricing (Output)
Nebius     prime-intellect/intellect-3-20251126    131K             $0.20 / 1M tokens   $1.10 / 1M tokens
Parasail   prime-intellect/intellect-3-20251126    131K             $0.20 / 1M tokens   $1.10 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration