AllenAI: Olmo 3.1 32B Think

Text input Text output Unavailable
Author's Description

Olmo 3.1 32B Think is a large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following. Building on the Olmo 3 series, version 3.1 delivers refined reasoning behavior and stronger performance across demanding evaluations and nuanced conversational tasks. Developed by Ai2 under the Apache 2.0 license, Olmo 3.1 32B Think continues the Olmo initiative’s commitment to openness, providing full transparency across model weights, code, and training methodology.

Key Specifications
Cost
$$$$$
Context
65K
Parameters
32B
Released
Dec 16, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Frequency Penalty Structured Outputs Top P Response Format Reasoning Temperature Stop Presence Penalty Include Reasoning Max Tokens Logit Bias
Features

This model supports the following features:

Structured Outputs Response Format Reasoning
Performance Summary

AllenAI: Olmo 3.1 32B Think, a 32-billion-parameter model designed for deep reasoning and complex instruction following, demonstrates a strong overall performance profile. While its speed ranking indicates it tends to have longer response times, placing it in the 14th percentile, its pricing is moderate, falling within the 31st percentile. A significant strength is its exceptional reliability, boasting a 99% success rate across benchmarks, ensuring consistent and usable responses. The model excels in several key areas. It shows robust performance in Reasoning (90.0% accuracy, 74th percentile) and Coding (91.0% accuracy, 69th percentile), indicating its proficiency in complex problem-solving and programming tasks. Its General Knowledge is also impressive at 99.0% accuracy (62nd percentile). Instruction Following is solid at 64.6% accuracy (65th percentile). A notable weakness appears in its handling of Hallucinations, where its 91.5% accuracy (43rd percentile) suggests room for improvement in acknowledging uncertainty. Ethics (98.0% accuracy, 37th percentile) and Email Classification (96.8% accuracy, 38th percentile) are competent but not top-tier. Mathematics performance is average at 88.4% accuracy (51st percentile). Overall, Olmo 3.1 32B Think is a reliable model with strong reasoning and coding capabilities, though its speed and hallucination control could be enhanced.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.15
Completion $0.5

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Parasail
Parasail | allenai/olmo-3.1-32b-think-20251215 65K $0.15 / 1M tokens $0.5 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by allenai