Author's Description
[Microsoft Research](/microsoft) Phi-4 is designed to perform well on complex reasoning tasks and to run efficiently where memory is limited or quick responses are needed. It has 14 billion parameters and was trained on a mix of high-quality synthetic datasets, data from curated websites, and academic materials. It was then further tuned to follow instructions accurately and to maintain strong safety standards. It works best with English-language inputs. For more information, see the [Phi-4 Technical Report](https://arxiv.org/pdf/2412.08905).
Performance Summary
Microsoft's Phi-4, a 14-billion-parameter model, shows a strong overall performance profile, with cost-efficiency and reliability as its standout qualities. Its speed is above the median, ranking in the 61st percentile across five benchmarks. Its pricing is among the most competitive, ranking in the 85th percentile for cost. Reliability is exceptional: a 100% success rate across all benchmarks indicates minimal technical failures and consistently usable responses.
On individual benchmarks, Phi-4 is strongest in ethical reasoning, with 100% accuracy on the Ethics (Baseline) benchmark, making it the most accurate model at its price point and among models of similar speed. It also acknowledges uncertainty well, scoring 96.0% accuracy on the Hallucinations (Baseline) test, which measures how reliably a model identifies fictional concepts. General knowledge is solid at 96.8% accuracy.
The model is weaker on complex reasoning, scoring 50.0% accuracy on the Reasoning (Baseline) benchmark, which places it in the 34th percentile. Email classification is another area for improvement: 94.0% accuracy ranks only in the 30th percentile. In short, its key strengths are ethical understanding, reliability, and cost-effectiveness, while complex reasoning and some classification tasks leave room for improvement.
Model Pricing
Current Pricing
Feature | Price (per 1M tokens) |
---|---|
Prompt | $0.07 |
Completion | $0.14 |
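Because prices are quoted per 1M tokens, the cost of a single request is each token count multiplied by its rate and divided by one million. The following is a minimal sketch of that arithmetic using the rates in the table above; the function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider's API.

```python
from decimal import Decimal

# Rates copied from the Current Pricing table (USD per 1M tokens).
PROMPT_PRICE_PER_M = Decimal("0.07")
COMPLETION_PRICE_PER_M = Decimal("0.14")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    prompt_cost = Decimal(prompt_tokens) * PROMPT_PRICE_PER_M
    completion_cost = Decimal(completion_tokens) * COMPLETION_PRICE_PER_M
    return (prompt_cost + completion_cost) / TOKENS_PER_M


if __name__ == "__main__":
    # Example workload (assumed): 2,000 prompt tokens and 500 completion tokens.
    print(f"Estimated cost: ${estimate_cost(2000, 500)}")
```

`Decimal` is used instead of `float` so that very small per-request amounts do not pick up binary rounding noise when summed over many requests.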
Available Endpoints
Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
---|---|---|---|---|
DeepInfra | microsoft/phi-4 | 16K | $0.07 / 1M tokens | $0.14 / 1M tokens |
Nebius | microsoft/phi-4 | 16K | $0.06 / 1M tokens | $0.14 / 1M tokens |
NextBit | microsoft/phi-4 | 16K | $0.06 / 1M tokens | $0.14 / 1M tokens |
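Since the endpoints differ only in input price, the cheapest choice depends on how prompt-heavy a workload is. The sketch below compares the three listed endpoints for a given token mix. The `Endpoint` class and the `cheapest_endpoints` helper are hypothetical names introduced for illustration; only the provider names and prices come from the table above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Endpoint:
    provider: str
    input_price: Decimal   # USD per 1M input tokens
    output_price: Decimal  # USD per 1M output tokens


# Values copied from the Available Endpoints table.
ENDPOINTS = [
    Endpoint("DeepInfra", Decimal("0.07"), Decimal("0.14")),
    Endpoint("Nebius", Decimal("0.06"), Decimal("0.14")),
    Endpoint("NextBit", Decimal("0.06"), Decimal("0.14")),
]


def workload_cost(ep: Endpoint, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """USD cost of a workload on one endpoint."""
    total = prompt_tokens * ep.input_price + completion_tokens * ep.output_price
    return total / Decimal(1_000_000)


def cheapest_endpoints(prompt_tokens: int, completion_tokens: int) -> list[str]:
    """Return every provider tied for the lowest cost, sorted by name."""
    costs = {
        ep.provider: workload_cost(ep, prompt_tokens, completion_tokens)
        for ep in ENDPOINTS
    }
    lowest = min(costs.values())
    return sorted(name for name, cost in costs.items() if cost == lowest)


if __name__ == "__main__":
    # Assumed workload: 10M prompt tokens and 2M completion tokens.
    print(cheapest_endpoints(10_000_000, 2_000_000))
```

Price is only one factor; this comparison ignores latency, throughput, and availability, which the table does not report.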
Other Models by Microsoft
Model | Released | Params | Context | Modalities | Speed | Ability | Cost |
---|---|---|---|---|---|---|---|
Microsoft: Phi 4 Reasoning Plus | May 01, 2025 | ~14B | 32K | Text input; text output | ★ | ★★★ | $$$$ |
Microsoft: MAI DS R1 | Apr 20, 2025 | — | 163K | Text input; text output | ★★★★ | ★★★★★ | $$$ |
Microsoft: Phi 4 Multimodal Instruct | Mar 07, 2025 | ~5.6B | 131K | Text input; image input; text output | ★★ | ★★ | $$ |
Microsoft: Phi-3.5 Mini 128K Instruct | Aug 20, 2024 | ~3.8B | 128K | Text input; text output | ★ | ★★ | $$ |
Microsoft: Phi-3 Mini 128K Instruct | May 25, 2024 | ~3.8B | 128K | Text input; text output | ★★★ | ★★ | $$ |
Microsoft: Phi-3 Medium 128K Instruct | May 23, 2024 | ~14B | 128K | Text input; text output | ★★ | ★ | $$$$ |
WizardLM-2 8x22B | Apr 15, 2024 | 22B | 65K | Text input; text output | ★★★ | ★★ | $$$ |