Arcee AI: Caller Large

Text input Text output Unavailable
Author's Description

Caller Large is Arcee's specialist "function‑calling" SLM built to orchestrate external tools and APIs. Instead of maximizing next‑token accuracy, training focuses on structured JSON outputs, parameter extraction and multi‑step tool chains, making Caller a natural choice for retrieval‑augmented generation, robotic process automation or data‑pull chatbots. It incorporates a routing head that decides when (and how) to invoke a tool versus answering directly, reducing hallucinated calls. The model is already the backbone of Arcee Conductor's auto‑tool mode, where it parses user intent, emits clean function signatures and hands control back once the tool response is ready. Developers thus gain an OpenAI‑style function‑calling UX without handing requests to a frontier‑scale model.

Key Specifications
Cost
$$$$
Context
32K
Released
May 05, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Logit Bias Tool Choice Response Format Stop Min P Top P Max Tokens Frequency Penalty Temperature Presence Penalty
Features

This model supports the following features:

Tools Response Format
Performance Summary

Arcee AI: Caller Large is a specialized function-calling SLM designed for orchestrating external tools and APIs, prioritizing structured JSON outputs and multi-step tool chains over next-token accuracy. The model performs among the fastest models, typically ranking in the top tier for speed (76th percentile across 5 benchmarks). It also offers competitive pricing, falling around the 50th percentile across benchmarks. Notably, Caller Large demonstrates strong reliability with a 95% success rate, indicating it consistently provides usable responses with few technical issues. In terms of benchmark performance, Caller Large exhibits a mixed profile. While it achieves a respectable 90.5% accuracy in General Knowledge, this places it in the 31st percentile, suggesting it's not a top performer in broad factual recall. Similarly, its 97.0% accuracy in Ethics is solid but ranks in the 34th percentile. A significant weakness is observed in Email Classification, where its 87.0% accuracy places it in the 11th percentile, indicating challenges in this specific classification task. Instruction Following shows moderate performance at 53.1% accuracy (55th percentile), while Coding is also a weaker area at 60.0% accuracy (25th percentile). Its core strength lies in its intended purpose: function-calling and structured output generation, which is not directly measured by these general benchmarks but is highlighted by its design and integration into Arcee Conductor. Its speed and reliability are key advantages, making it suitable for applications where consistent, fast tool orchestration is paramount.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.55
Completion $0.85

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Together
Together | arcee-ai/caller-large 32K $0.55 / 1M tokens $0.85 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by arcee-ai