Author's Description
Llemma 7B is a language model for mathematics. It was initialized with Code Llama 7B weights, and trained on the Proof-Pile-2 for 200B tokens. Llemma models are particularly strong at...
Key Specifications
Supported Parameters
This model supports the following parameters:
Performance Summary
EleutherAI's Llemma 7B, a mathematics-focused language model initialized from Code Llama 7B and trained on Proof-Pile-2, demonstrates exceptional speed and cost-efficiency, consistently ranking among the fastest and most competitively priced models across five benchmarks. However, its performance on general benchmarks is notably low. In General Knowledge, it achieved only 7.5% accuracy (10th percentile), and in Ethics, 7.0% accuracy (8th percentile). Email Classification also showed limited capability at 8.0% accuracy (3rd percentile). A significant weakness is observed in Instruction Following, where it scored 0.0% accuracy, indicating a complete inability to follow complex directives. Coding performance was similarly low at 4.0% accuracy (10th percentile). These results suggest that while Llemma 7B excels in its core design for mathematical reasoning and tool use, its capabilities do not extend to broader general-purpose tasks. Its strengths are highly specialized, aligning with its description as a model for mathematics, particularly in chain-of-thought reasoning and computational tool integration.
Model Pricing
Current Pricing
| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.8 |
| Completion | $1.2 |
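For illustration, the per-request cost implied by these rates can be computed directly; a minimal sketch (the function name and token counts below are hypothetical, the rates are those listed above):

```python
# Listed Llemma 7B rates, converted from $ per 1M tokens to $ per token.
PROMPT_RATE = 0.8 / 1_000_000
COMPLETION_RATE = 1.2 / 1_000_000

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the dollar cost of a single request at the listed rates."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. a 2,000-token prompt with a 500-token completion costs $0.0022.
print(f"${request_cost(2_000, 500):.4f}")
```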
Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Featherless | eleutherai/llemma_7b | 4K | $0.8 / 1M tokens | $1.2 / 1M tokens |
Benchmark Results
| Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
|---|---|---|---|---|---|---|---|---|