Meituan: LongCat Flash Chat

Text input Text output
Author's Description

LongCat-Flash-Chat is a large-scale Mixture-of-Experts (MoE) model with 560B total parameters, of which 18.6B–31.3B (≈27B on average) are dynamically activated per input. It introduces a shortcut-connected MoE design to reduce communication overhead and achieve high throughput while maintaining training stability through advanced scaling strategies such as hyperparameter transfer, deterministic computation, and multi-stage optimization. This release, LongCat-Flash-Chat, is a non-thinking foundation model optimized for conversational and agentic tasks. It supports long context windows up to 128K tokens and shows competitive performance across reasoning, coding, instruction following, and domain benchmarks, with particular strengths in tool use and complex multi-step interactions.

Key Specifications
Cost
$$$
Context
131K
Parameters
560B (Rumoured)
Released
Sep 09, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Top P Max Tokens Temperature
Performance Summary

LongCat-Flash-Chat, Meituan's 560B parameter MoE model, demonstrates a strong overall performance profile, particularly excelling in reliability. With a 99% success rate across benchmarks, it consistently provides usable responses, indicating exceptional stability. In terms of speed, the model exhibits moderate performance, ranking in the 29th percentile, suggesting it is not among the fastest but offers acceptable latency. Cost-wise, LongCat-Flash-Chat is generally cost-effective, placing in the 62nd percentile. The model showcases remarkable accuracy in foundational knowledge and ethical reasoning, achieving perfect scores in both General Knowledge and Ethics benchmarks, often being the most accurate at its price point and speed. It also performs very well in Email Classification and Instruction Following, securing 88th percentile accuracy in both. Its reasoning capabilities are strong, with an 80% accuracy, placing it in the 87th percentile. While its Coding performance is solid at 84% accuracy, it ranks lower at the 60th percentile compared to other models. Key strengths include its high reliability, exceptional accuracy in knowledge-based and ethical tasks, and strong instruction following and reasoning. No significant weaknesses are apparent, though its speed is moderate rather than leading.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.15
Completion $0.75

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
AtlasCloud
AtlasCloud | meituan/longcat-flash-chat 131K $0.15 / 1M tokens $0.75 / 1M tokens
Chutes
Chutes | meituan/longcat-flash-chat 131K $0.25 / 1M tokens $1 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration