Google: Gemini 3.1 Flash Lite Preview

File input Image input Audio input Text input Video input Text output
Author's Description

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across key capabilities. Improvements span audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash.

Key Specifications
Cost
$$$
Context
1M
Released
Mar 02, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tools Response Format Structured Outputs Temperature Top P Tool Choice Max Tokens Seed Stop Reasoning Include Reasoning
Features

This model supports the following features:

Reasoning Structured Outputs Tools Response Format
Performance Summary

Google's Gemini 3.1 Flash Lite Preview, created on March 2, 2026, is a high-efficiency model designed for high-volume applications. It consistently performs among the fastest models, ranking in the 91st percentile across 8 benchmarks, and offers competitive pricing, typically providing cost-effective solutions at the 61st percentile. The model demonstrates exceptional reliability with a 100% success rate across all benchmarks, indicating minimal technical failures. Key strengths include perfect accuracy in Hallucinations, General Knowledge, and Ethics benchmarks, often achieving these results at competitive price points and impressive speeds. It also excels in Mathematics (96% accuracy, 97th percentile) and Email Classification (99% accuracy, 88th percentile). The model shows strong performance in Instruction Following (76% accuracy, 84th percentile) and Reasoning (82% accuracy, 68th percentile), frequently being the most accurate among models of comparable speed. While its Coding accuracy (87%, 57th percentile) is solid, it's not a standout compared to its other top-tier performances. Overall, Gemini 3.1 Flash Lite Preview delivers a compelling balance of speed, cost-efficiency, and high accuracy across a broad range of critical AI capabilities, making it suitable for diverse high-volume use cases.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.25
Completion $1.5
Input Cache Read $0.025
Input Cache Write $0.0833
Internal Reasoning $1.5

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Google
Google | google/gemini-3.1-flash-lite-preview-20260303 1M $0.25 / 1M tokens $1.5 / 1M tokens
Google AI Studio
Google AI Studio | google/gemini-3.1-flash-lite-preview-20260303 1M $0.25 / 1M tokens $1.5 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google