Google: Gemma 3n 4B

Text input Text output Free Option
Author's Description

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks such as text generation, speech recognition, translation, and image analysis. Leveraging innovations like Per-Layer Embedding (PLE) caching and the MatFormer architecture, Gemma 3n dynamically manages memory usage and computational load by selectively activating model parameters, significantly reducing runtime resource requirements. This model supports a wide linguistic range (trained in over 140 languages) and features a flexible 32K token context window. Gemma 3n can selectively load parameters, optimizing memory and computational efficiency based on the task or device capabilities, making it well-suited for privacy-focused, offline-capable applications and on-device AI solutions. [Read more in the blog post](https://developers.googleblog.com/en/introducing-gemma-3n/)

Key Specifications
Cost
$
Context
32K
Parameters
4B
Released
May 20, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Logit Bias Stop Min P Top P Max Tokens Frequency Penalty Temperature Presence Penalty
Performance Summary

Gemma 3n 4B demonstrates competitive response times, ranking in the 49th percentile across various benchmarks. It consistently offers among the most competitive pricing, placing in the 94th percentile. The model exhibits exceptional reliability with a 98% success rate, indicating minimal technical failures. In terms of performance across categories, Gemma 3n 4B shows strong capabilities in Email Classification, achieving 99.0% accuracy and being recognized as the most accurate model at its price point. It also performs well in Ethics (98.0% accuracy) and General Knowledge (95.0% accuracy). However, the model struggles significantly with Instruction Following, achieving only 2.0% accuracy, and shows moderate performance in Reasoning (58.0% accuracy) and Coding (71.0% accuracy). Its key strengths lie in its cost-effectiveness, high reliability, and strong performance in classification and knowledge-based tasks. The primary weakness is its limited ability to follow complex instructions.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.02
Completion $0.04

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Together
Together | google/gemma-3n-e4b-it 32K $0.02 / 1M tokens $0.04 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google