Author's Description
LongCat-Flash-Chat is a large-scale Mixture-of-Experts (MoE) model with 560B total parameters, of which 18.6B–31.3B (≈27B on average) are dynamically activated per input. It introduces a shortcut-connected MoE design to reduce communication overhead and achieve high throughput while maintaining training stability through advanced scaling strategies such as hyperparameter transfer, deterministic computation, and multi-stage optimization. This release, LongCat-Flash-Chat, is a non-thinking foundation model optimized for conversational and agentic tasks. It supports long context windows up to 128K tokens and shows competitive performance across reasoning, coding, instruction following, and domain benchmarks, with particular strengths in tool use and complex multi-step interactions.
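To put the sparsity figures above in perspective, the short sketch below computes what share of the 560B parameters is active per input, using only the numbers quoted in the description. The constant and function names are illustrative and do not come from the release.

```python
# Figures quoted in the description above, in billions of parameters.
TOTAL_B = 560.0
ACTIVE_MIN_B, ACTIVE_AVG_B, ACTIVE_MAX_B = 18.6, 27.0, 31.3


def active_share(active_b: float, total_b: float = TOTAL_B) -> float:
    """Return the fraction of total parameters activated for one input."""
    return active_b / total_b


if __name__ == "__main__":
    for label, value in [("min", ACTIVE_MIN_B), ("avg", ACTIVE_AVG_B), ("max", ACTIVE_MAX_B)]:
        # Prints each share as a percentage with one decimal place.
        print(f"{label}: {active_share(value):.1%}")
```

By these figures, roughly 3% to 6% of the model's parameters take part in any single forward pass.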
Performance Summary
LongCat-Flash-Chat, Meituan's 560B-parameter MoE model, shows a strong overall profile, with reliability as its standout trait: a 99% success rate across benchmarks means it almost always returns a usable response. Speed is moderate; the model ranks in the 29th percentile, so it is not among the fastest but offers acceptable latency. On cost it places in the 62nd percentile and is generally cost-effective. Accuracy is highest on foundational knowledge and ethical reasoning, with perfect scores on both the General Knowledge and Ethics benchmarks, where it is often the most accurate model at its price point and speed. It also does well on Email Classification and Instruction Following, reaching the 88th percentile for accuracy in both. Reasoning is strong at 80% accuracy (87th percentile). Coding is solid at 84% accuracy but ranks lower relative to other models, at the 60th percentile. No significant weaknesses are apparent beyond its moderate speed.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.15 |
Completion | $0.75 |
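As a minimal sketch of how these per-million-token rates translate into a per-request cost, the helper below applies the two prices from the table. The function name and the example token counts are assumptions chosen for illustration.

```python
# Rates from the Current Pricing table, in USD per 1M tokens.
PROMPT_USD_PER_M = 0.15
COMPLETION_USD_PER_M = 0.75


def request_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request in USD at the listed rates."""
    total = prompt_tokens * PROMPT_USD_PER_M + completion_tokens * COMPLETION_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 completion tokens.
    print(f"${request_cost_usd(10_000, 2_000):.4f}")
```

Because completion tokens cost five times as much as prompt tokens, output length tends to dominate the bill for generation-heavy workloads.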
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
AtlasCloud | meituan/longcat-flash-chat | 131K | $0.15 / 1M tokens | $0.75 / 1M tokens |
Chutes | meituan/longcat-flash-chat | 131K | $0.25 / 1M tokens | $1.00 / 1M tokens |
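The two endpoints differ only in price, so a workload can be costed against each one directly. The sketch below is one way to do that under the listed rates. The `Endpoint` class, the `cheapest` helper, and the sample token volumes are hypothetical and are not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One row of the endpoint table; prices are USD per 1M tokens."""
    provider: str
    input_usd_per_m: float
    output_usd_per_m: float

    def cost_usd(self, prompt_tokens: int, completion_tokens: int) -> float:
        total = prompt_tokens * self.input_usd_per_m + completion_tokens * self.output_usd_per_m
        return total / 1_000_000


# Values copied from the Available Endpoints table.
ENDPOINTS = [
    Endpoint("AtlasCloud", 0.15, 0.75),
    Endpoint("Chutes", 0.25, 1.00),
]


def cheapest(prompt_tokens: int, completion_tokens: int) -> Endpoint:
    """Return the endpoint with the lowest estimated cost for this workload."""
    return min(ENDPOINTS, key=lambda e: e.cost_usd(prompt_tokens, completion_tokens))


if __name__ == "__main__":
    # Hypothetical monthly volume: 1M prompt tokens, 200K completion tokens.
    for endpoint in ENDPOINTS:
        print(endpoint.provider, round(endpoint.cost_usd(1_000_000, 200_000), 2))
    print("cheapest:", cheapest(1_000_000, 200_000).provider)
```

At the listed rates AtlasCloud is cheaper for both input and output, so it comes out ahead for any token mix; the comparison becomes more useful if prices change or further endpoints are added.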
Benchmark Results
Benchmark | Category | Reasoning | Strategy | Free | Executions | Accuracy | Cost | Duration |
---|---|---|---|---|---|---|---|---|