Author's Description
Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70 B peers, it retains the 128 k context inherited from Qwen 2.5, letting it ingest books, codebases or financial filings wholesale. Training blended DeepSeek R1 distillation, multi‑epoch supervised fine‑tuning and a final DPO/RLHF alignment stage, yielding strong performance on BIG‑Bench‑Hard, GSM‑8K and long‑context Needle‑In‑Haystack tests. Enterprises use Virtuoso‑Large as the "fallback" brain in Conductor pipelines when other SLMs flag low confidence. Despite its size, aggressive KV‑cache optimizations keep first‑token latency in the low‑second range on 8× H100 nodes, making it a practical production‑grade powerhouse.
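The "fallback" role described above can be pictured as a confidence-gated router. The following is a minimal sketch only: the `Answer` type, the `confidence` field, the 0.7 threshold, and the stub models are all hypothetical illustrations, not the Conductor API or Arcee's actual routing logic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Answer:
    text: str
    confidence: float  # hypothetical score in [0, 1]; not a documented Conductor field
    model: str


Model = Callable[[str], Answer]


def answer_with_fallback(
    prompt: str, small: Model, fallback: Model, threshold: float = 0.7
) -> Answer:
    """Try the small model first; escalate only when it reports low confidence."""
    first = small(prompt)
    if first.confidence >= threshold:
        return first
    return fallback(prompt)


# Stub models standing in for real API calls (assumptions for illustration).
def stub_slm(prompt: str) -> Answer:
    # Pretend the small model is only confident on short prompts.
    confidence = 0.9 if len(prompt) < 40 else 0.3
    return Answer("draft answer", confidence, "small-model")


def stub_virtuoso(prompt: str) -> Answer:
    return Answer("detailed answer", 0.95, "arcee-ai/virtuoso-large")
```

With these stubs, a short prompt is answered by the small model, while a long one falls through to the larger model. A real pipeline would need its own way of estimating confidence.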
Performance Summary
Arcee AI's Virtuoso-Large is a 72B-parameter general-purpose LLM designed for cross-domain reasoning, creative writing, and enterprise QA, with a 128k context length inherited from Qwen 2.5. It ranks among the fastest models, placing in the 92nd percentile for speed across 8 benchmarks, and offers competitive pricing, placing in the 44th percentile. Its 99% success rate indicates few technical failures. Virtuoso-Large performs best in Hallucinations and Ethics, where it achieves perfect accuracy, and in Email Classification (99% accuracy), and it is often the most accurate model at its price point or speed. It is also strong in General Knowledge (99.5% accuracy) and Coding (84% accuracy). Mathematics (79% accuracy) falls in the middle range, while Reasoning (74% accuracy) and Instruction Following (63% accuracy) are respectable but are not its strongest suits. Aggressive KV-cache optimizations keep first-token latency in the low-second range on H100 nodes, making it a practical production-grade option, particularly as a fallback in Conductor pipelines.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.75 |
Completion | $1.20 |
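As a rough illustration of how the per-token prices above translate into a per-request cost, here is a small sketch. The function name and the example token counts are assumptions made for illustration; only the two prices come from the table.

```python
# Prices from the table above, in USD per 1M tokens.
PROMPT_USD_PER_M = 0.75
COMPLETION_USD_PER_M = 1.20


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = (
        prompt_tokens * PROMPT_USD_PER_M
        + completion_tokens * COMPLETION_USD_PER_M
    )
    return total / 1_000_000


# Hypothetical request: a 100,000-token document plus a 2,000-token answer.
cost = estimate_cost_usd(100_000, 2_000)
print(f"${cost:.4f}")  # 0.075 + 0.0024 = $0.0774
```

Because prompt tokens dominate long-context requests, the input price matters most when feeding the model whole documents.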
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
Together | arcee-ai/virtuoso-large | 131K | $0.75 / 1M tokens | $1.20 / 1M tokens |
Benchmark Results
Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
---|---|---|---|---|---|---|---|---|
Other Models by arcee-ai
Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
Arcee AI: AFM 4.5B | Sep 16, 2025 | 5B | 65K | Text input; text output | ★ | ★★ | $$$ |
Arcee AI: Caller Large (unavailable) | May 05, 2025 | — | 32K | Text input; text output | ★★★★★ | ★★★ | $$$$ |
Arcee AI: Spotlight | May 05, 2025 | ~7B | 131K | Text input, image input; text output | ★★★★★ | ★★★ | $$ |
Arcee AI: Maestro Reasoning | May 05, 2025 | ~32B | 131K | Text input; text output | ★★ | ★★★★ | $$$$$ |
Arcee AI: Coder Large | May 05, 2025 | ~32B | 32K | Text input; text output | ★★★★★ | ★★★★ | $$$ |
Arcee AI: Virtuoso Medium V2 (unavailable) | May 05, 2025 | ~32B | 131K | Text input; text output | ★★★★★ | ★★★★ | $$$ |
Arcee AI: Arcee Blitz (unavailable) | May 05, 2025 | ~24B | 32K | Text input; text output | ★★★★ | ★★★★ | $$$ |