Google: Gemini 1.5 Flash

Text input Image input Text output
Author's Description

Gemini 1.5 Flash is a foundation model that performs well at a variety of multimodal tasks such as visual understanding, classification, summarization, and creating content from image, audio and video. It's adept at processing visual and text inputs such as photographs, documents, infographics, and screenshots. Gemini 1.5 Flash is designed for high-volume, high-frequency tasks where cost and latency matter. On most common tasks, Flash achieves comparable quality to other Gemini Pro models at a significantly reduced cost. Flash is well-suited for applications like chat assistants and on-demand content generation where speed and scale matter. Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). #multimodal

Key Specifications
Cost
$$
Context
1M
Parameters
500B (Rumoured)
Released
May 13, 2024
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Stop Presence Penalty Tool Choice Top P Temperature Seed Tools Structured Outputs Response Format Frequency Penalty Max Tokens
Features

This model supports the following features:

Tools Structured Outputs Response Format
Performance Summary

Google's Gemini 1.5 Flash demonstrates exceptional performance across various metrics, positioning itself as a highly efficient and cost-effective multimodal AI model. It consistently ranks among the fastest models available, achieving an Infinityth percentile in speed across all benchmarks. Similarly, its pricing is remarkably competitive, also securing an Infinityth percentile ranking for cost-efficiency. The model exhibits outstanding reliability, with a 100th percentile ranking, indicating minimal technical failures and consistent, usable responses. In terms of specific benchmark performance, Gemini 1.5 Flash shows strong capabilities. It achieved 100% accuracy in Email Classification, making it the most accurate model at its price point and among models of comparable speed. It also performed well in General Knowledge (97.5% accuracy) and Ethics (98.0% accuracy). While its Coding (Baseline) accuracy was 84.0%, it showed a notable duration of 135887ms. A significant weakness was observed in Instruction Following, where it registered 0.0% accuracy, suggesting this is an area for improvement. Its Reasoning capabilities are solid at 74.0% accuracy. Overall, Gemini 1.5 Flash is well-suited for high-volume, high-frequency tasks where speed, cost, and reliability are paramount, particularly for applications like chat assistants and content generation.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.075
Completion $0.3
Input Cache Read $0.0188
Input Cache Write $0.158

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Google
Google | google/gemini-flash-1.5 1M $0.075 / 1M tokens $0.3 / 1M tokens
Google AI Studio
Google AI Studio | google/gemini-flash-1.5 1M $0.075 / 1M tokens $0.3 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by google