OpenGVLab: InternVL3 2B

Text input Image input Text output Unavailable
Author's Description

The 2b version of the InternVL3 series, for an even higher inference speed and very reasonable performance. An advanced multimodal large language model (MLLM) series that demonstrates superior overall performance. Compared to InternVL 2.5, InternVL3 exhibits superior multimodal perception and reasoning capabilities, while further extending its multimodal capabilities to encompass tool usage, GUI agents, industrial image analysis, 3D vision perception, and more.

Key Specifications
Cost
$$
Context
12K
Parameters
2B
Released
Apr 30, 2025
Speed
Ability
Reliability
Supported Parameters

This model supports the following parameters:

Top P Temperature Max Tokens
Performance Summary

OpenGVLab's InternVL3 2B, created on April 30, 2025, is positioned as a high-speed, reasonably performing multimodal large language model. It consistently ranks among the fastest models, demonstrating exceptional inference speed. Furthermore, it offers competitive pricing, ranking in the 89th percentile across benchmarks. While excelling in speed and cost-efficiency, the model exhibits significant variability in performance across different categories. In Classification, specifically Email Classification, it achieved 79.0% accuracy, indicating a reasonable ability to categorize structured text. However, its performance in more complex cognitive tasks is notably weaker. The model scored 0.0% accuracy in Reasoning, suggesting a current inability to handle multi-step logical problems. Similarly, in Ethics and General Knowledge, it achieved only 27.0% and 15.0% accuracy respectively, placing it in the lower percentiles for these domains. The longer durations observed in these benchmarks (Ethics: 225814ms, General Knowledge: 439530ms) further highlight the challenges it faces with these complex tasks. Its core strength lies in its speed and cost-effectiveness, making it potentially suitable for high-throughput, less cognitively demanding classification tasks. However, its current limitations in reasoning, ethical understanding, and general knowledge are significant weaknesses that would need to be addressed for broader application.

Model Pricing

Current Pricing

Feature Price (per 1M tokens)
Prompt $0.05
Completion $0.1

Price History

Available Endpoints
Provider Endpoint Name Context Length Pricing (Input) Pricing (Output)
Nineteen
Nineteen | opengvlab/internvl3-2b 12K $0.05 / 1M tokens $0.1 / 1M tokens
Benchmark Results
Benchmark Category Reasoning Free Executions Accuracy Cost Duration
Other Models by opengvlab