Google: Gemma 3 4B

Text input Image input Text output Free Option
Author's Description

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling.

Key Specifications
Cost
$$
Context
131K
Parameters
4B
Released
Mar 13, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Response Format Stop Seed Min P Top P Max Tokens Frequency Penalty Temperature Presence Penalty
Features

This model supports the following features:

Response Format
Performance Summary

Google's Gemma 3 4B demonstrates competitive response times, performing among the faster models with a 59th percentile speed ranking. It stands out for its exceptional price competitiveness, consistently offering among the most affordable options, ranking in the 92nd percentile. The model also exhibits strong reliability, with a 91% success rate across benchmarks, indicating consistent operational stability. In terms of performance across categories, Gemma 3 4B shows a mixed profile. Its primary strength lies in Ethics, achieving 96.0% accuracy, and it also performs reasonably well in General Knowledge (69.8%) and Coding (66.0%). However, a significant weakness is its high hallucination rate, with only 2.0% accuracy in the Hallucinations (Baseline) test, suggesting a tendency to generate incorrect information rather than acknowledge uncertainty. Performance in Instruction Following (10.1% accuracy), Email Classification (77.0% accuracy), and Reasoning (36.0% accuracy) is also relatively low. While its Mathematics accuracy is 68.0%, this places it in the 33rd percentile, indicating room for improvement compared to other models. The model's multimodality, supporting vision-language input, and advanced features like structured outputs and function calling, are notable capabilities not directly reflected in these specific benchmarks.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.04
Completion $0.08

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
DeepInfra
DeepInfra | google/gemma-3-4b-it 131K $0.04 / 1M tokens $0.08 / 1M tokens
NextBit
NextBit | google/gemma-3-4b-it 131K $0.04 / 1M tokens $0.08 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by google