DeepSeek: DeepSeek V3 Base

Text input · Text output · Unavailable
Author's Description

Note that this is a base model mostly meant for testing; you need to provide detailed prompts for the model to return useful responses. DeepSeek-V3 Base is a 671B parameter open Mixture-of-Experts (MoE) language model with 37B active parameters per forward pass and a context length of 128K tokens. Trained on 14.8T tokens using FP8 mixed precision, it achieves high training efficiency and stability, with strong performance across language, reasoning, math, and coding tasks. DeepSeek-V3 Base is the pre-trained model behind [DeepSeek V3](/deepseek/deepseek-chat-v3).
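
Because base models are not instruction-tuned, they respond best to completion-style prompts that spell out the task and establish a pattern for the model to continue. The snippet below is a minimal illustration of such a prompt; the task and few-shot wording are invented for the example, not taken from DeepSeek's documentation.

```python
# Illustrative few-shot, completion-style prompt for a base (non-instruct) model.
# A base model continues text, so the prompt itself must define the task and
# demonstrate the expected answer format.
prompt = (
    "Translate English to French.\n"
    "\n"
    "English: The weather is nice today.\n"
    "French: Il fait beau aujourd'hui.\n"
    "\n"
    "English: Where is the train station?\n"
    "French:"
)
```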

Key Specifications
Cost: $$$
Context: 163K
Parameters: 671B (Rumoured)
Released: Mar 29, 2025
Supported Parameters

This model supports the following parameters:

Top Logprobs, Stop, Logprobs, Max Tokens, Top P, Frequency Penalty, Logit Bias, Min P, Seed, Temperature, Presence Penalty
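
As an illustration of how these might be passed in practice, the sketch below sends a completion request with several of the listed parameters. The endpoint URL, API-key environment variable, and exact field names are assumptions about a typical OpenAI-compatible API, not a documented recipe for any specific provider.

```python
import os

import requests

# Hypothetical OpenAI-compatible completions endpoint; substitute your
# provider's actual URL.
API_URL = "https://example-provider.invalid/v1/completions"

payload = {
    "model": "deepseek/deepseek-v3-base",
    "prompt": "English: Where is the train station?\nFrench:",
    # Sampling controls drawn from the supported-parameter list above.
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 64,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "seed": 42,
    "stop": ["\n"],
    # min_p is listed as supported here but is an extension to the
    # standard OpenAI request schema.
    "min_p": 0.05,
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ.get('PROVIDER_API_KEY', '')}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json())
```
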
Performance Summary

DeepSeek V3 Base, a 671B-parameter open Mixture-of-Experts (MoE) model, ranks among the fastest models evaluated and offers highly competitive pricing across all benchmarks. Released on March 29, 2025, with a context length of 163,840 tokens, this base model is intended for testing and requires detailed prompts to perform well.

Benchmark performance varies considerably. Its highest accuracy was in Coding (24.1%) and General Knowledge (17.6%), placing it in the 18th and 13th percentiles respectively, while Ethics (13.0%) and Email Classification (10.0%) ranked in the 11th and 4th percentiles. A significant weakness is Instruction Following, where it scored 0.0% accuracy. Cost efficiency is generally good, particularly in General Knowledge and Ethics, but task durations for Ethics and Email Classification are high, indicating slower processing on those tasks despite the model's overall speed ranking. In short, the model is fast and inexpensive, but its accuracy on several tasks needs improvement.

Model Pricing

Current Pricing

| Feature | Price (per 1M tokens) |
|---|---|
| Prompt | $0.2 |
| Completion | $0.8 |
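
At these rates, per-request cost is simple to estimate: multiply prompt and completion token counts by their respective prices and divide by one million. The helper below is a minimal sketch of that arithmetic.

```python
# Per-token prices taken from the table above ($ per 1M tokens).
PROMPT_PRICE_PER_M = 0.2
COMPLETION_PRICE_PER_M = 0.8

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated request cost in US dollars."""
    return (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    ) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion costs
# (2000 * 0.2 + 500 * 0.8) / 1e6 = $0.0008.
print(f"${estimate_cost(2000, 500):.6f}")
```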

Price History

Available Endpoints
| Provider | Endpoint Name | Context Length | Pricing (Input) | Pricing (Output) |
|---|---|---|---|---|
| Chutes | deepseek/deepseek-v3-base | 163K | $0.2 / 1M tokens | $0.8 / 1M tokens |
Benchmark Results
Benchmark Category Reasoning Strategy Free Executions Accuracy Cost Duration