Google: Gemini 2.5 Flash Lite

File input Text input Image input Audio input Video input Text output
Author's Description

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

Key Specifications
Cost
$$$
Context
1M
Released
Jul 22, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Seed Tools Structured Outputs Top P Response Format Reasoning Temperature Stop Include Reasoning Tool Choice Max Tokens
Features

This model supports the following features:

Structured Outputs Response Format Tools Reasoning
Performance Summary

Gemini 2.5 Flash Lite, a lightweight reasoning model, consistently performs among the fastest models, ranking in the 89th percentile across benchmarks. It offers competitive pricing, typically providing cost-effective solutions in the 72nd percentile. Notably, the model demonstrates exceptional reliability with a 100% success rate across all evaluated benchmarks, indicating minimal technical failures. In terms of performance across categories, Gemini 2.5 Flash Lite excels in Email Classification and Ethics, achieving perfect 100% accuracy in both, and is recognized as the most accurate model at its price point and among models of similar speed. It also shows strong performance in General Knowledge (99.0% accuracy), being the most accurate among models this fast. While its Hallucinations accuracy is 90.0%, it exhibits moderate performance in Instruction Following (58.0%), Reasoning (62.0%), Coding (79.0%), and Mathematics (77.0%). Its primary strength lies in its speed, reliability, and cost-effectiveness for tasks requiring high accuracy in classification and ethical considerations. Its main weakness appears to be in more complex reasoning and mathematical problem-solving, where its accuracy is lower compared to its top-tier performance in other areas.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.1
Completion $0.4
Input Cache Read $0.01
Input Cache Write $0.0833
Internal Reasoning $0.4

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Google
Google | google/gemini-2.5-flash-lite 1M $0.1 / 1M tokens $0.4 / 1M tokens
Google AI Studio
Google AI Studio | google/gemini-2.5-flash-lite 1M $0.1 / 1M tokens $0.4 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google