MoonshotAI: Kimi VL A3B Thinking

Text input Image input Text output Free Option
Author's Description

Kimi-VL is a lightweight Mixture-of-Experts vision-language model that activates only 2.8B parameters per step while delivering strong performance on multimodal reasoning and long-context tasks. The Kimi-VL-A3B-Thinking variant, fine-tuned with chain-of-thought and reinforcement learning, excels in math and visual reasoning benchmarks like MathVision, MMMU, and MathVista, rivaling much larger models such as Qwen2.5-VL-7B and Gemma-3-12B. It supports 128K context and high-resolution input via its MoonViT encoder.

Key Specifications
Cost
$$
Context
131K
Parameters
3B
Released
Apr 10, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Stop Presence Penalty Logit Bias Temperature Seed Frequency Penalty Max Tokens Include Reasoning Top P Min P Reasoning Logprobs Top Logprobs
Features

This model supports the following features:

Reasoning
Performance Summary

Moonshot AI's Kimi VL A3B Thinking model consistently ranks among the fastest models and offers highly competitive pricing, demonstrating exceptional reliability with a 99% success rate. In terms of performance, the model exhibits a significant strength in Ethics, achieving perfect accuracy and proving to be the most accurate model at its price point and speed. This highlights its robust ethical reasoning capabilities. However, the model shows a critical weakness in Instruction Following, scoring 0.0% accuracy across two separate benchmarks, indicating a fundamental limitation in processing and executing complex instructions. In other areas, Kimi VL A3B Thinking delivers moderate performance. It achieves 78.0% accuracy in Coding and 63.3% in Reasoning, placing it around the middle of the pack for these categories. Its General Knowledge score of 85.0% is respectable but falls into the lower quartile. Overall, while excelling in ethical considerations and offering strong value in terms of speed and cost, the model's inability to follow instructions is a major drawback that needs addressing.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.025
Completion $0.1

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Chutes
Chutes | moonshotai/kimi-vl-a3b-thinking 131K $0.025 / 1M tokens $0.1 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by moonshotai