inclusionAI: Ling-2.6-flash

Text input Text output
Author's Description

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency....

Key Specifications
Cost
$$
Context
262K
Parameters
104B (Rumoured)
Released
Apr 21, 2026
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Tool Choice Tools Response Format Temperature Max Tokens Structured Outputs Presence Penalty Stop Top P Frequency Penalty Seed
Features

This model supports the following features:

Structured Outputs Tools Response Format
Performance Summary

inclusionAI's Ling-2.6-flash model demonstrates strong performance in terms of operational efficiency, consistently ranking among the fastest models in its class (60th percentile for speed) and offering highly competitive pricing (89th percentile). However, its accuracy across various benchmarks indicates significant areas for improvement. The model struggles particularly with Instruction Following (12.0% accuracy), Email Classification (5th percentile), and Ethics (13th percentile), suggesting limitations in understanding complex directives and nuanced scenarios. While its General Knowledge (52.5%) and Hallucinations (50.0%) scores are moderate, they still fall within the lower percentiles. Performance in Mathematics, Reasoning, and Coding also remains low, hovering around the 37-38% accuracy range. Overall, Ling-2.6-flash's key strength lies in its cost-effectiveness and speed, making it potentially suitable for applications where rapid, low-cost responses are prioritized over high accuracy in complex tasks. Its primary weakness is its generally low accuracy across a broad spectrum of cognitive and practical benchmarks.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.08
Completion $0.24
Input Cache Read $0.016

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Novita
Novita | inclusionai/ling-2.6-flash-20260421 262K $0.08 / 1M tokens $0.24 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by inclusionai