Google: Gemini 2.5 Flash Lite Preview 09-2025

File input Text input Image input Audio input Video input Text output
Author's Description

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Key Specifications
Cost
$$$
Context
1M
Released
Sep 25, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Tools Structured Outputs Top P Response Format Reasoning Temperature Stop Include Reasoning Tool Choice Max Tokens
Features

This model supports the following features:

Structured Outputs Response Format Tools Reasoning
Performance Summary

Google's Gemini 2.5 Flash Lite Preview 09-2025 is a lightweight reasoning model designed for ultra-low latency and cost efficiency. It consistently performs among the fastest models, ranking in the 80th percentile across benchmarks, and offers competitive pricing, typically in the 71st percentile. The model demonstrates exceptional reliability with a 99% success rate, indicating minimal technical failures. In terms of performance, Gemini 2.5 Flash Lite excels in General Knowledge (99.5% accuracy), Email Classification (99.0% accuracy), and achieves perfect accuracy in Ethics (100.0%). Its General Knowledge performance is particularly noteworthy for its speed. While optimized for speed by default, its Hallucinations accuracy (88.0%) is moderate, and its Coding (80.0%) and Mathematics (81.0%) accuracy are in the lower percentiles for those categories. Instruction Following (67.0%) and Reasoning (72.0%) show solid, mid-range performance. The model's strength lies in its ability to deliver high accuracy in knowledge-based and classification tasks at a rapid pace and competitive cost, making it suitable for applications where speed and efficiency are paramount, even if it means a trade-off in complex reasoning or specialized coding tasks without enabling multi-pass reasoning.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.4
Input Cache Read $0.01
Input Cache Write $0.0833
Internal Reasoning $0.4

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Google AI Studio
Google AI Studio | google/gemini-2.5-flash-lite-preview-09-2025 1M $0.1 / 1M tokens $0.4 / 1M tokens
Google
Google | google/gemini-2.5-flash-lite-preview-09-2025 1M $0.1 / 1M tokens $0.4 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google