StepFun: Step 3.7 Flash

Video input Image input Text input Text output
Author's Description

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter language backbone with a vision encoder for native image and video understanding, activating roughly 11B parameters...

Key Specifications
Cost
$$$
Context
256K
Parameters
196B (Rumoured)
Released
May 28, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Response Format Include Reasoning Temperature Top Logprobs Tools Max Tokens Reasoning Stop Frequency Penalty Logprobs Structured Outputs Top P
Features

This model supports the following features:

Structured Outputs Reasoning Tools Response Format
Performance Summary

StepFun's Step 3.7 Flash, a 196B-parameter multimodal Mixture-of-Experts model, demonstrates exceptional performance across various metrics. It consistently ranks in the top tier for speed, placing in the 80th percentile across 8 benchmarks, indicating it performs among the fastest models available. The model also offers competitive pricing, typically providing cost-effective solutions in the 79th percentile. A standout feature is its perfect reliability, achieving a 100% success rate across all 8 benchmarks, signifying minimal technical failures and consistent, usable responses. In terms of benchmark performance, Step 3.7 Flash achieves perfect 100% accuracy in Hallucinations, Instruction Following, General Knowledge, Coding, Email Classification, Ethics, and Mathematics, often being the most accurate model at its price point and speed. This highlights its robust capabilities in understanding and generating content, following complex instructions, and demonstrating broad knowledge. The only notable area for improvement is Reasoning, where it achieved 50% accuracy, placing it in the 27th percentile. Despite this, its overall accuracy and efficiency in other critical areas, coupled with its multimodal capabilities for native image and video understanding, position Step 3.7 Flash as a highly capable and reliable AI solution.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.2
Completion $1.15
Input Cache Read $0.04

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
StepFun
StepFun | stepfun/step-3.7-flash-20260528 256K $0.2 / 1M tokens $1.15 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by stepfun