Microsoft: Phi 4 Mini Instruct

Text input Text output
Author's Description

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4...

Key Specifications
Cost
$$
Context
128K
Released
Oct 17, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Response Format Temperature Max Tokens Structured Outputs Presence Penalty Stop Top P Frequency Penalty Seed
Features

This model supports the following features:

Structured Outputs Response Format
Performance Summary

Microsoft's Phi-4-mini-instruct demonstrates moderate speed performance, ranking in the 34th percentile across benchmarks. It offers cost-effective solutions, typically providing competitive pricing in the 79th percentile. The model exhibits strong reliability with an 86% success rate, indicating consistent and usable responses. Analysis of benchmark results reveals a mixed performance profile. The model excels in speed for Hallucinations (Baseline), achieving the fastest duration, though its accuracy in this area is low at 68%. However, its accuracy across other critical categories is notably low. It struggles significantly with Email Classification (4% accuracy), Reasoning (12% accuracy), Ethics (52% accuracy), and General Knowledge (58% accuracy). Instruction Following and Coding also show limited accuracy at 27% and 38% respectively. While its cost-effectiveness is a strength, particularly in Hallucinations, Instruction Following, and Reasoning, the model's overall accuracy across most benchmarks suggests significant limitations in its current capabilities for complex tasks.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.08
Completion $0.35
Input Cache Read $0.08

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
WandB
WandB | microsoft/phi-4-mini-instruct 128K $0.08 / 1M tokens $0.35 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by microsoft