DeepSeek: DeepSeek V3 Base

Name: DeepSeek: DeepSeek V3 Base
Brand: deepseek
Availability: OutOfStock
Rating: 1.4 (5 reviews)

Modalities: Text input, Text output
Status: Unavailable

Author's Description

Note: this is a base model intended mainly for testing, so you need to provide detailed prompts for it to return useful responses. DeepSeek-V3 Base is a 671B-parameter open Mixture-of-Experts (MoE) language model with 37B parameters active per forward pass and a context length of 128K tokens. It was trained on 14.8T tokens using FP8 mixed precision, which gives high training efficiency and stability, and it performs strongly across language, reasoning, math, and coding tasks. DeepSeek-V3 Base is the pre-trained model behind [DeepSeek V3](/deepseek/deepseek-chat-v3).
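Because a base model continues text rather than following instructions, a detailed prompt usually means showing the pattern you want completed. The sketch below is one hypothetical way to assemble such a few-shot prompt; the function name and the `Input:`/`Output:` layout are illustrative assumptions, not a format prescribed by the model's authors.

```python
from typing import Iterable, Tuple


def build_few_shot_prompt(
    task: str,
    examples: Iterable[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a completion-style prompt for a base model.

    The layout (task line, worked examples, then an open "Output:" slot)
    is an illustrative convention, not an official prompt format.
    """
    lines = [task.strip(), ""]
    for example_input, example_output in examples:
        lines += [f"Input: {example_input}", f"Output: {example_output}", ""]
    # End on an open slot so the model's continuation is the answer.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    task="Translate English to French.",
    examples=[("cheese", "fromage"), ("bread", "pain")],
    query="apple",
)
print(prompt)
```

Ending the prompt on an open `Output:` line, and passing a stop sequence such as a newline, tends to keep the continuation to a single answer.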

Key Specifications

Cost

$$$

Context

163K

Parameters

671B (Rumoured)

Released

Mar 29, 2025

Speed

★

Ability

★

Reliability

★

Supported Parameters

This model supports the following parameters:

Stop, Max Tokens, Temperature, Min P, Top P, Logprobs, Top Logprobs, Frequency Penalty, Presence Penalty, Seed, Logit Bias
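As a rough illustration of how the parameters listed above might be used, the sketch below builds a completion request body and rejects any sampling parameter that is not on the list. The snake_case field names are an assumption based on the common OpenAI-style convention (the listing gives only display names), the helper `build_request` is hypothetical, and the model slug is taken from the endpoint table on this page. No network call is made.

```python
from typing import Any, Dict

MODEL = "deepseek/deepseek-v3-base"  # slug as shown in the endpoint table

# Assumed snake_case spellings of the display names listed on this page.
SUPPORTED_PARAMETERS = {
    "stop",
    "max_tokens",
    "temperature",
    "min_p",
    "top_p",
    "logprobs",
    "top_logprobs",
    "frequency_penalty",
    "presence_penalty",
    "seed",
    "logit_bias",
}


def build_request(prompt: str, **params: Any) -> Dict[str, Any]:
    """Return a request body, refusing parameters the listing does not name."""
    unknown = set(params) - SUPPORTED_PARAMETERS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {"model": MODEL, "prompt": prompt, **params}


request_body = build_request(
    "Input: apple\nOutput:",
    max_tokens=16,
    temperature=0.2,
    stop=["\n"],
    seed=42,
)
print(request_body)
```

Validating locally is a design choice for the sketch: it surfaces a mistyped or unsupported parameter before a request is sent, instead of relying on the provider to reject or silently ignore it.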

Performance Summary

DeepSeek V3 Base is a 671B-parameter open Mixture-of-Experts (MoE) model released on March 29, 2025, with a context length of 163,840 tokens. It is a base model intended for testing and needs detailed prompts to perform well. Across the evaluated benchmarks it ranks among the fastest models and is competitively priced.

Benchmark accuracy varies by category. The best results were in Coding (24.1% accuracy, 18th percentile) and General Knowledge (17.6%, 13th percentile). Ethics (13.0%, 11th percentile) and Email Classification (10.0%, 4th percentile) were notably lower, and Instruction Following is a significant weakness at 0.0% accuracy.

Cost efficiency is generally good, particularly in General Knowledge and Ethics. Task duration for Ethics and Email Classification is high, however, which indicates slow processing on those tasks despite the overall speed ranking. The model is fast overall, but its accuracy on several tasks needs improvement.

Model Pricing

Current Pricing

Feature	Price (per 1M tokens)
Prompt	$0.2
Completion	$0.8
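To make the per-token prices concrete, here is a small sketch that estimates the cost of one request from the rates in the table above. The function name and the token counts in the usage line are hypothetical; actual billing may differ by provider.

```python
PROMPT_PRICE_PER_M = 0.2      # USD per 1M prompt tokens, from the table above
COMPLETION_PRICE_PER_M = 0.8  # USD per 1M completion tokens, from the table above


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single request at the listed rates."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        prompt_tokens * PROMPT_PRICE_PER_M
        + completion_tokens * COMPLETION_PRICE_PER_M
    )
    return total / 1_000_000


# Hypothetical request: 100,000 prompt tokens and 20,000 completion tokens.
cost = estimate_cost(100_000, 20_000)
print(f"${cost:.4f}")  # prints $0.0360
```

Completion tokens cost four times as much as prompt tokens at these rates, so capping Max Tokens has a larger effect on cost than trimming the prompt by the same number of tokens.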

Available Endpoints

Provider	Endpoint Name	Context Length	Pricing (Input)	Pricing (Output)
Chutes	Chutes | deepseek/deepseek-v3-base	163K	$0.2 / 1M tokens	$0.8 / 1M tokens

Benchmark Results

Benchmark	Category	Reasoning	Strategy	Free	Executions	Accuracy	Cost	Duration

Other Models by deepseek

Model	Released	Params	Context	Modalities	Speed	Ability	Cost
DeepSeek: DeepSeek V3.2 Speciale	Dec 01, 2025	—	131K	Text input Text output	★	★★★★★	$$$$
DeepSeek: DeepSeek V3.2	Dec 01, 2025	—	131K	Text input Text output	—	—	$$$
DeepSeek: DeepSeek V3.2 Exp	Sep 29, 2025	—	131K	Text input Text output	★★★	★★★★★	$$$
DeepSeek: DeepSeek V3.1 Terminus	Sep 22, 2025	~671B	131K	Text input Text output	★★★★	★★★★★	$$$$
DeepSeek: DeepSeek V3.1 Terminus (exacto)	Sep 22, 2025	~671B	131K	Text input Text output	—	—	$$$
DeepSeek: DeepSeek V3.1	Aug 21, 2025	~671B	131K	Text input Text output	★★	★★★★	$$$
DeepSeek: DeepSeek V3.1 Base (Unavailable)	Aug 20, 2025	~671B	163K	Text input Text output	★	★	$$
DeepSeek: R1 Distill Qwen 7B (Unavailable)	May 30, 2025	7B	131K	Text input Text output	★	★	$$$$
DeepSeek: DeepSeek R1 0528 Qwen3 8B	May 29, 2025	8B	131K	Text input Text output	★★★	★★★	$$
DeepSeek: R1 0528	May 28, 2025	~671B	128K	Text input Text output	★★★	★★★	$$$
DeepSeek: DeepSeek Prover V2	Apr 30, 2025	~671B	131K	Text input Text output	★★	★★★★	$$$$
DeepSeek: DeepSeek V3 0324	Mar 24, 2025	~685B	163K	Text input Text output	★★★★	★★★★★	$$
DeepSeek: R1 Distill Llama 8B (Unavailable)	Feb 07, 2025	8B	32K	Text input Text output	★	★★	$$
DeepSeek: R1 Distill Qwen 1.5B (Unavailable)	Jan 31, 2025	1.5B	131K	Text input Text output	★★★	★	$$$
DeepSeek: R1 Distill Qwen 32B	Jan 29, 2025	32B	131K	Text input Text output	★	★★★★	$$$
DeepSeek: R1 Distill Qwen 14B	Jan 29, 2025	14B	32K	Text input Text output	★	★★	$$$
DeepSeek: R1 Distill Llama 70B	Jan 23, 2025	70B	131K	Text input Text output	★★★	★★★★★	$$
DeepSeek: R1	Jan 20, 2025	~671B	128K	Text input Text output	★★★	★★★★	$$$
DeepSeek: DeepSeek V3	Dec 26, 2024	—	163K	Text input Text output	★★★	★★★★	$$$