Microsoft: Phi-3 Medium 128K Instruct

Text input, text output
Author's Description

Phi-3 Medium 128K is a powerful 14-billion-parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference adjustments, it excels at tasks involving common sense, mathematics, logical reasoning, and code processing. At the time of release, Phi-3 Medium demonstrated state-of-the-art performance among lightweight models. On the MMLU-Pro eval, the model even comes close to Llama3 70B-level performance. For a 4K context length, try [Phi-3 Medium 4K](/models/microsoft/phi-3-medium-4k-instruct).

Key Specifications
- Cost: $$$$
- Context: 128K
- Parameters: 14B (Rumoured)
- Released: May 23, 2024
Supported Parameters

This model supports the following parameters:

- Tools
- Tool Choice
- Top P
- Max Tokens
- Temperature
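
As a rough illustration of how the sampling parameters map onto an OpenAI-compatible chat request, here is a minimal sketch in Python. The base URL, API key environment variable, and choice of the `openai` client library are assumptions, not details from this page; only the model slug matches the endpoint listed further below. Tools and Tool Choice are shown separately in the Features section sketch.

```python
# Minimal sketch of a request using the supported sampling parameters
# (Temperature, Top P, Max Tokens). Base URL and API key variable are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://example.com/api/v1",   # assumed OpenAI-compatible endpoint
    api_key=os.environ["EXAMPLE_API_KEY"],   # assumed environment variable
)

response = client.chat.completions.create(
    model="microsoft/phi-3-medium-128k-instruct",
    messages=[
        {"role": "user", "content": "Summarize the Phi-3 model family in two sentences."},
    ],
    temperature=0.7,  # Temperature: sampling randomness
    top_p=0.9,        # Top P: nucleus sampling cutoff
    max_tokens=256,   # Max Tokens: cap on completion length
)

print(response.choices[0].message.content)
```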
Features

This model supports the following features:

Tools
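
Since tool use is the one listed feature, the sketch below shows how the Tools and Tool Choice parameters could be exercised through an OpenAI-compatible client. The endpoint URL, API key variable, and the `get_weather` tool definition are illustrative assumptions, not part of this page.

```python
# Minimal tool-calling sketch; the get_weather tool is hypothetical.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://example.com/api/v1",   # assumed OpenAI-compatible endpoint
    api_key=os.environ["EXAMPLE_API_KEY"],   # assumed environment variable
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="microsoft/phi-3-medium-128k-instruct",
    messages=[{"role": "user", "content": "What is the weather in Seattle right now?"}],
    tools=tools,         # Tools
    tool_choice="auto",  # Tool Choice: let the model decide whether to call a tool
)

# If the model chose to call the tool, the structured call appears here instead of text.
print(response.choices[0].message.tool_calls)
```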
Performance Summary

Microsoft's Phi-3 Medium 128K Instruct, released on May 23, 2024, is positioned as a powerful 14-billion-parameter model optimized for advanced language understanding and reasoning. It consistently ranks among the fastest models evaluated and offers highly competitive pricing across all benchmarks. Despite the strong claims in its description, including performance close to Llama3 70B on MMLU-Pro, the benchmark results reported here indicate significant limitations: the model scored 0.0% accuracy on the General Knowledge, Ethics, Instruction Following, Reasoning, and Coding benchmarks, and only 4.0% accuracy (4th percentile) on Email Classification. While its speed and cost efficiency are exceptional, the lack of correct responses across all tested categories is a critical weakness: the model processes requests efficiently but does not produce correct or relevant outputs for these specific tasks. Further evaluation with more detailed metrics beyond baseline accuracy would help clarify its true capabilities and identify areas for improvement.

Model Pricing

Current Pricing

| Feature    | Price (per 1M tokens) |
|------------|-----------------------|
| Prompt     | $1                    |
| Completion | $1                    |
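
Because prompt and completion tokens are billed at the same $1 per 1M tokens, a per-request estimate is simply total tokens divided by one million. A small sketch, where the token counts are made-up example values:

```python
# Cost estimate at the listed rates: $1 per 1M prompt tokens, $1 per 1M completion tokens.
PROMPT_PRICE_PER_M = 1.00
COMPLETION_PRICE_PER_M = 1.00

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single request at the listed rates."""
    return (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    ) / 1_000_000

# Example: a 100,000-token prompt (well within the 128K context) and a 1,000-token reply.
print(f"${estimate_cost(100_000, 1_000):.4f}")  # $0.1010
```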


Available Endpoints
| Provider | Endpoint Name                        | Context Length | Pricing (Input) | Pricing (Output) |
|----------|--------------------------------------|----------------|-----------------|------------------|
| Azure    | microsoft/phi-3-medium-128k-instruct | 128K           | $1 / 1M tokens  | $1 / 1M tokens   |
Benchmark Results
| Benchmark | Category | Reasoning Strategy | Free | Executions | Accuracy | Cost | Duration |
|-----------|----------|--------------------|------|------------|----------|------|----------|