EleutherAI: Llemma 7b

Text input, text output
Author's Description

Llemma 7B is a language model for mathematics. It was initialized with Code Llama 7B weights, and trained on the Proof-Pile-2 for 200B tokens. Llemma models are particularly strong at chain-of-thought mathematical reasoning and using computational tools for mathematics, such as Python and formal theorem provers.

Key Specifications
Cost: $$$$
Context: 4K
Parameters: 7B
Released: Apr 14, 2025
Supported Parameters

This model supports the following parameters:

Stop, Seed, Min P, Top P, Max Tokens, Frequency Penalty, Temperature, Presence Penalty
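
As a rough illustration of how the parameters listed above might appear in a request, the sketch below builds a JSON request body. It assumes an OpenAI-style completions payload with snake_case field names (`min_p`, `top_p`, and so on). The field names, the `build_request` helper, and the example values are illustrative assumptions and are not taken from this page; only the model slug `eleutherai/llemma_7b` comes from the endpoint listing. No network call is made.

```python
import json

# Assumed snake_case spellings of the supported parameters; the exact
# field names accepted by a given provider may differ.
SUPPORTED_PARAMETERS = {
    "stop", "seed", "min_p", "top_p", "max_tokens",
    "frequency_penalty", "temperature", "presence_penalty",
}


def build_request(prompt, **sampling):
    """Build a request body, rejecting parameters not in the supported list."""
    unsupported = set(sampling) - SUPPORTED_PARAMETERS
    if unsupported:
        raise ValueError(f"unsupported parameters: {sorted(unsupported)}")
    return {"model": "eleutherai/llemma_7b", "prompt": prompt, **sampling}


# Hypothetical example values, chosen only for illustration.
body = build_request(
    "Prove that the sum of two even integers is even.",
    temperature=0.2,
    top_p=0.95,
    max_tokens=512,
    stop=["\n\n"],
    seed=0,
)
print(json.dumps(body, indent=2))
```

Validating against the supported list up front means a parameter this model does not accept (for example a hypothetical `top_k`) fails locally instead of being silently ignored or rejected by the provider.
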
Performance Summary

EleutherAI's Llemma 7B is a language model specialized in mathematics. Created on April 14, 2025, and initialized with Code Llama 7B weights, it was further trained on the Proof-Pile-2 for 200B tokens, targeting mathematical reasoning and tool use. Across five benchmarks, it ranks among the fastest models and is competitively priced.

Despite its specialized training, Llemma 7B is weak on general-purpose benchmarks. Accuracy is low in every evaluated category: General Knowledge (7.5%), Ethics (7.0%), Email Classification (8.0%), and Coding (4.0%). It failed the Instruction Following benchmark outright, with 0.0% accuracy.

Cost per evaluation is generally competitive, but duration on these tasks often falls in the lowest percentiles, meaning slow processing relative to the low accuracy achieved. So although the model is fast in general terms, it is inefficient on these specific tasks. The provided data includes no reliability ranking, so reliability cannot be assessed.

Llemma 7B's key strength is its foundational training for mathematical reasoning and tool use, which these general benchmarks do not directly measure. Its main weakness is poor performance on broad-spectrum tasks, which marks it as a highly specialized model rather than a general-purpose one.

Model Pricing

Current Pricing

Feature       Price (per 1M tokens)
Prompt        $0.8
Completion    $1.2
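
To make the per-token pricing concrete, here is a minimal sketch of the arithmetic for a single request. The helper name `estimate_cost` and the example token counts are hypothetical; only the two prices come from the table above.

```python
# Prices from the pricing table, in USD per 1M tokens.
PROMPT_PRICE_PER_M = 0.80
COMPLETION_PRICE_PER_M = 1.20


def estimate_cost(prompt_tokens, completion_tokens):
    """Return the estimated cost in USD for one request."""
    return (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    ) / 1_000_000


# Example: a 3,000-token prompt with a 1,000-token completion.
print(f"${estimate_cost(3_000, 1_000):.4f}")  # prints $0.0036
```

Since the listed context is 4K, a single request cannot be much larger than this example, so per-request costs stay well under a cent at these prices.
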

Available Endpoints
Provider: Featherless
Endpoint Name: Featherless | eleutherai/llemma_7b
Context Length: 4K
Pricing (Input): $0.8 / 1M tokens
Pricing (Output): $1.2 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration