Google: Gemini 2.5 Flash Lite Preview 06-17

Text input Image input File input Audio input Text output
Author's Description

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.

Key Specifications
Cost
$$$
Context
1M
Released
Jun 17, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Structured Outputs Tool Choice Reasoning Include Reasoning Response Format Stop Seed Top P Max Tokens Temperature
Features

This model supports the following features:

Tools Reasoning Response Format Structured Outputs
Performance Summary

Google's Gemini 2.5 Flash Lite Preview 06-17 demonstrates strong performance as a lightweight, cost-efficient reasoning model. It consistently ranks among the fastest models, placing in the 81st percentile across benchmarks, and offers competitive pricing, typically falling within the 66th percentile. Notably, the model exhibits exceptional reliability with a 100% success rate across all evaluated benchmarks, indicating minimal technical failures. In terms of specific performance, Gemini 2.5 Flash Lite excels in Email Classification (99.0% accuracy, 89th percentile) and General Knowledge (98.5% accuracy, 62nd percentile), showcasing its proficiency in structured data processing and broad factual recall. Its ability to handle hallucinations is also strong, with 94.0% accuracy. While its Instruction Following (60.0% accuracy, 64th percentile) and Ethics (99.0% accuracy, 59th percentile) are solid, areas like Mathematics (76.0% accuracy, 37th percentile) and Coding (78.5% accuracy, 40th percentile) present notable weaknesses, suggesting limitations in complex problem-solving within these domains when "thinking" is disabled. The model's core strength lies in its speed and cost-effectiveness for tasks where rapid, reliable responses are paramount, with the option to enable multi-pass reasoning for more complex challenges.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.4
Input Cache Read $0.025
Input Cache Write $0.183

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Google
Google | google/gemini-2.5-flash-lite-preview-06-17 1M $0.1 / 1M tokens $0.4 / 1M tokens
Google AI Studio
Google AI Studio | google/gemini-2.5-flash-lite-preview-06-17 1M $0.1 / 1M tokens $0.4 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google