MoonshotAI: Kimi VL A3B Thinking

Text input Image input Text output Free Option
Author's Description

Kimi-VL is a lightweight Mixture-of-Experts vision-language model that activates only 2.8B parameters per step while delivering strong performance on multimodal reasoning and long-context tasks. The Kimi-VL-A3B-Thinking variant, fine-tuned with chain-of-thought and reinforcement learning, excels in math and visual reasoning benchmarks like MathVision, MMMU, and MathVista, rivaling much larger models such as Qwen2.5-VL-7B and Gemma-3-12B. It supports 128K context and high-resolution input via its MoonViT encoder.

Key Specifications
Cost
$$$
Context
131K
Parameters
3B
Released
Apr 10, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Reasoning Include Reasoning Seed Top P Temperature Top Logprobs Logit Bias Logprobs Stop Min P Max Tokens Frequency Penalty Presence Penalty
Features

This model supports the following features:

Reasoning
Performance Summary

MoonshotAI's Kimi VL A3B Thinking model, a lightweight Mixture-of-Experts vision-language model, demonstrates exceptional speed and competitive pricing. It consistently ranks among the fastest models across all evaluated benchmarks and offers among the most competitive pricing across five benchmarks. With a 98% success rate across six benchmarks, its reliability is notably high, indicating consistent operational stability. In terms of performance, the model exhibits a significant strength in Ethics, achieving perfect accuracy and being highlighted as the most accurate model at its price point and among models of similar speed. Its General Knowledge is solid at 85% accuracy, placing it in the 26th percentile, while its Reasoning capabilities are moderate at 64% accuracy (51st percentile). Coding performance is fair at 78% accuracy (40th percentile). A notable weakness is observed in Instruction Following, where it scored 0.0% accuracy in both instances, suggesting a critical area for improvement in understanding and executing complex, multi-layered instructions. Despite this, its fine-tuning with chain-of-thought and reinforcement learning appears to contribute to its strong performance in specific reasoning and ethical tasks.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.02
Completion $0.07

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Chutes
Chutes | moonshotai/kimi-vl-a3b-thinking 131K $0.02 / 1M tokens $0.07 / 1M tokens
Chutes
Chutes | moonshotai/kimi-vl-a3b-thinking 131K $0.02 / 1M tokens $0.07 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration
Other Models by moonshotai