StepFun: Step 3.7 Flash

Name: StepFun: Step 3.7 Flash
Brand: stepfun
Price: 2e-7 USD
Availability: InStock
Rating: 4.3 (8 reviews)

Back

Image input Text input Video input Text output

Author's Description

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Key Specifications

Cost

$$$

Context

256K

Parameters

196B (Rumoured)

Released

May 28, 2026

Speed

★★★★★

Ability

★★★★★

Reliability

★★★★★

Hugging Face

Supported Parameters

This model supports the following parameters:

Stop Max Tokens Structured Outputs Reasoning Top P Frequency Penalty Temperature Top Logprobs Include Reasoning Logprobs Tools Tool Choice Response Format

Features

This model supports the following features:

Response Format Tools Structured Outputs Reasoning

Performance Summary

StepFun's Step 3.7 Flash, a 196B-parameter multimodal Mixture-of-Experts model, demonstrates exceptional performance across various metrics. It consistently ranks in the top tier for speed, placing in the 80th percentile across 8 benchmarks, indicating it performs among the fastest models available. The model also offers competitive pricing, typically providing cost-effective solutions in the 79th percentile. A standout feature is its perfect reliability, achieving a 100% success rate across all 8 benchmarks, signifying minimal technical failures and consistent, usable responses. In terms of benchmark performance, Step 3.7 Flash achieves perfect 100% accuracy in Hallucinations, Instruction Following, General Knowledge, Coding, Email Classification, Ethics, and Mathematics, often being the most accurate model at its price point and speed. This highlights its robust capabilities in understanding and generating content, following complex instructions, and demonstrating broad knowledge. The only notable area for improvement is Reasoning, where it achieved 50% accuracy, placing it in the 27th percentile. Despite this, its overall accuracy and efficiency in other critical areas, coupled with its multimodal capabilities for native image and video understanding, position Step 3.7 Flash as a highly capable and reliable AI solution.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.2
Completion	$1.15
Input Cache Read	$0.04

Price History

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
StepFun	StepFun \| stepfun/step-3.7-flash-20260528	256K	$0.2 / 1M tokens	$1.15 / 1M tokens
DeepInfra	DeepInfra \| stepfun/step-3.7-flash-20260528	262K	$0.2 / 1M tokens	$1.15 / 1M tokens
Novita	Novita \| stepfun/step-3.7-flash-20260528	262K	$0.2 / 1M tokens	$1.15 / 1M tokens
Novita	Novita \| stepfun/step-3.7-flash-20260528	262K	$0.2 / 1M tokens	$1.15 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by stepfun

	Released	Params	Context	Filter by Modalities All Modalities	Speed	Ability	Cost
StepFun: Step 3.7 Flash Unavailable	May 28, 2026	~196B	256K	Image input Text input Video input Text output	★★★★★	★★	$$
StepFun: Step 3.5 Flash	Jan 29, 2026	~196B	256K	Text input Text output	★★★	★★★★★	$$$

StepFun: Step 3.7 Flash

Author's Description

Key Specifications

Supported Parameters

Features

Performance Summary

Model Pricing

Current Pricing

Price History

Available Endpoints

Benchmark Results

Other Models by stepfun

Your Privacy Matters