Google: Gemini 2.5 Flash Lite Preview 09-2025

Image input Audio input Video input Text input File input Text output
Author's Description

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.

Key Specifications
Cost
$$$
Context
1M
Released
Sep 25, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Reasoning Stop Tools Top P Response Format Tool Choice Temperature Seed Include Reasoning Structured Outputs Max Tokens
Features

This model supports the following features:

Response Format Reasoning Structured Outputs Tools
Performance Summary

Google's Gemini 2.5 Flash Lite Preview 09-2025 is a lightweight reasoning model designed for ultra-low latency and cost efficiency. It performs among the fastest models, ranking in the 79th percentile for speed, and offers competitive pricing, placing in the 68th percentile. The model demonstrates exceptional reliability with a 99% success rate, indicating minimal technical failures. In terms of performance across benchmarks, Gemini 2.5 Flash Lite excels in several areas. It achieved perfect accuracy in Ethics, also being the most accurate model at its price point and among models of similar speed. Its General Knowledge is also a significant strength, scoring 99.5% accuracy and being the most accurate among models this fast. Email Classification also shows strong performance at 99.0% accuracy. While its Instruction Following is solid at 67.0% accuracy, its Hallucinations (88.0% accuracy) and Coding (80.0% accuracy) benchmarks indicate areas for potential improvement, particularly given its speed. Reasoning and Mathematics scores are moderate, suggesting that while "thinking" is disabled by default for speed, enabling it via the Reasoning API could enhance performance in these complex tasks.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.4

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Google AI Studio
Google AI Studio | google/gemini-2.5-flash-lite-preview-09-2025 1M $0.1 / 1M tokens $0.4 / 1M tokens
Google
Google | google/gemini-2.5-flash-lite-preview-09-2025 1M $0.1 / 1M tokens $0.4 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google