Microsoft: Phi-3.5 Mini 128K Instruct

Modalities: text input → text output
Author's Description

Phi-3.5 models are lightweight, state-of-the-art open models. They were trained on the Phi-3 datasets, which include both synthetic data and filtered, publicly available website data, with a focus on high-quality, reasoning-dense content. Phi-3.5 Mini has 3.8B parameters and is a dense decoder-only transformer using the same tokenizer as [Phi-3 Mini](/models/microsoft/phi-3-mini-128k-instruct). The models underwent a rigorous enhancement process incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3.5 models showed state-of-the-art performance among models with fewer than 13 billion parameters.

Key Specifications

| Spec | Value |
|---|---|
| Cost | $$ |
| Context | 128K |
| Parameters | 3.8B (rumoured) |
| Released | Aug 20, 2024 |
Supported Parameters

This model supports the following parameters:

- Tools
- Tool Choice
- Top P
- Max Tokens
- Temperature
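As a sketch of how these parameters map onto a request, here is a minimal payload in the OpenAI-compatible chat-completions shape many providers expose for this model. The tool definition (`get_weather`), its schema, and all parameter values are illustrative assumptions, not part of the model card:

```python
def build_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion payload exercising
    the parameters this model supports (illustrative values)."""
    return {
        "model": "microsoft/phi-3.5-mini-128k-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,   # sampling temperature
        "top_p": 0.9,         # nucleus-sampling cutoff
        "max_tokens": 512,    # cap on completion length
        "tools": [{           # a hypothetical tool (function) definition
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_request("What's the weather in Paris?")
```

The payload would then be POSTed to the provider's chat-completions endpoint; exact URL and authentication depend on the provider.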
Features

This model supports the following features:

- Tools
Performance Summary

Microsoft's Phi-3.5 Mini 128K Instruct demonstrates exceptional speed, consistently ranking among the fastest models across various benchmarks, and its highly competitive pricing positions it as a cost-effective option. With an 83% success rate across 7 benchmarks, the model exhibits strong reliability, consistently providing usable responses.

Across categories, Phi-3.5 Mini 128K Instruct shows a notable strength in Ethics, achieving perfect accuracy and ranking as the most accurate model at its price point and speed. However, it struggles significantly in Instruction Following, achieving 0.0% accuracy, a critical area for improvement. Its performance in General Knowledge (68.3%), Mathematics (30.2%), Reasoning (37.5%), and Coding (22.0%) is generally below average compared to other models, while Email Classification is moderate at 89.0% accuracy.

In short, the model's key strengths are speed, cost-efficiency, and ethical reasoning; its primary weaknesses are complex instruction following, coding, and advanced mathematical and reasoning tasks.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.10 |
| Completion | $0.10 |
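With prompt and completion tokens both priced at $0.10 per 1M, per-request cost is simple arithmetic. A minimal sketch (the token counts are an arbitrary example):

```python
PROMPT_PRICE = 0.10 / 1_000_000      # dollars per prompt token
COMPLETION_PRICE = 0.10 / 1_000_000  # dollars per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the dollar cost of one request at this model's pricing."""
    return prompt_tokens * PROMPT_PRICE + completion_tokens * COMPLETION_PRICE

# e.g. a 100K-token prompt (near the 128K context limit) with a 2K-token reply
cost = request_cost(100_000, 2_000)  # ≈ $0.0102
```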


Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Azure | microsoft/phi-3.5-mini-128k-instruct | 128K | $0.10 / 1M tokens | $0.10 / 1M tokens |
Benchmark Results