Author's Description
Caller Large is Arcee's specialist "function‑calling" SLM built to orchestrate external tools and APIs. Instead of maximizing next‑token accuracy, training focuses on structured JSON outputs, parameter extraction and multi‑step tool chains, making Caller a natural choice for retrieval‑augmented generation, robotic process automation or data‑pull chatbots. It incorporates a routing head that decides when (and how) to invoke a tool versus answering directly, reducing hallucinated calls. The model is already the backbone of Arcee Conductor's auto‑tool mode, where it parses user intent, emits clean function signatures and hands control back once the tool response is ready. Developers thus gain an OpenAI‑style function‑calling UX without handing requests to a frontier‑scale model.
Key Specifications
Supported Parameters
This model supports the following parameters:
Features
This model supports the following features:
Performance Summary
Arcee AI: Caller Large is a specialized function-calling SLM designed for orchestrating external tools and APIs, prioritizing structured JSON outputs and multi-step tool chains over next-token accuracy. The model performs among the fastest models, typically ranking in the top tier for speed (76th percentile across 5 benchmarks). It also offers competitive pricing, falling around the 50th percentile across benchmarks. Notably, Caller Large demonstrates strong reliability with a 95% success rate, indicating it consistently provides usable responses with few technical issues. In terms of benchmark performance, Caller Large exhibits a mixed profile. While it achieves a respectable 90.5% accuracy in General Knowledge, this places it in the 31st percentile, suggesting it's not a top performer in broad factual recall. Similarly, its 97.0% accuracy in Ethics is solid but ranks in the 34th percentile. A significant weakness is observed in Email Classification, where its 87.0% accuracy places it in the 11th percentile, indicating challenges in this specific classification task. Instruction Following shows moderate performance at 53.1% accuracy (55th percentile), while Coding is also a weaker area at 60.0% accuracy (25th percentile). Its core strength lies in its intended purpose: function-calling and structured output generation, which is not directly measured by these general benchmarks but is highlighted by its design and integration into Arcee Conductor. Its speed and reliability are key advantages, making it suitable for applications where consistent, fast tool orchestration is paramount.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.55 |
Completion | $0.85 |
Price History
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
Together
|
Together | arcee-ai/caller-large | 32K | $0.55 / 1M tokens | $0.85 / 1M tokens |
Benchmark Results
Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
---|
Other Models by arcee-ai
|
Released | Params | Context |
|
Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
Arcee AI: AFM 4.5B | Sep 16, 2025 | 5B | 65K |
Text input
Text output
|
★ | ★★ | $$$ |
Arcee AI: Spotlight | May 05, 2025 | ~7B | 131K |
Text input
Image input
Text output
|
★★★★★ | ★★★ | $$ |
Arcee AI: Maestro Reasoning | May 05, 2025 | ~32B | 131K |
Text input
Text output
|
★★ | ★★★★ | $$$$$ |
Arcee AI: Virtuoso Large | May 05, 2025 | ~72B | 131K |
Text input
Text output
|
★★★★★ | ★★★★ | $$$$ |
Arcee AI: Coder Large | May 05, 2025 | ~32B | 32K |
Text input
Text output
|
★★★★★ | ★★★★ | $$$ |
Arcee AI: Virtuoso Medium V2 Unavailable | May 05, 2025 | ~32B | 131K |
Text input
Text output
|
★★★★★ | ★★★★ | $$$ |
Arcee AI: Arcee Blitz Unavailable | May 05, 2025 | ~24B | 32K |
Text input
Text output
|
★★★★ | ★★★★ | $$$ |