DeepSeek-V3: A New Frontier in Open-Source AI Model Performance

DeepSeek AI has released the first version of its DeepSeek-V3 model series and open-sourced it at the same time. The model is designed to compete with the leading closed-source models on the market across a variety of tasks.

What is DeepSeek-V3?

DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated for each token. It was pre-trained on 14.8 trillion tokens.
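To make the sparsity concrete, the short calculation below works out what share of the parameters is active for a single token. It uses only the two figures quoted above; the constant and function names are illustrative and not part of any DeepSeek tooling.

```python
# Figures quoted in the text above.
TOTAL_PARAMS = 671e9   # total parameters in the MoE model
ACTIVE_PARAMS = 37e9   # parameters activated for each token


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters used for a single token."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")
```

In other words, only around one in eighteen parameters takes part in processing any given token, which is what lets a model of this size keep its per-token compute closer to that of a much smaller dense model.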

Performance Benchmarks: How Does DeepSeek-V3 Stack Up?

DeepSeek-V3 surpasses several open-source models, including Qwen2.5-72B and Llama-3.1-405B, in various benchmark tests. In some cases, it rivals the performance of top-tier closed-source models like GPT-4o and Claude-3.5-Sonnet.

Here’s a detailed look at its performance across different areas:

Encyclopedic Knowledge: Significantly improved over its predecessor, DeepSeek-V2.5, approaching the level of Claude-3.5-Sonnet-1022 in knowledge-based tasks like MMLU, MMLU-Pro, GPQA, and SimpleQA.
Long Text Handling: Excels in long text evaluations such as DROP, FRAMES, and LongBench v2, outperforming the other models on average.
Coding Prowess: Far exceeds other non-o1 models in algorithmic coding scenarios (Codeforces) and nears Claude-3.5-Sonnet-1022 in engineering code tasks (SWE-Bench Verified).
Mathematical Reasoning: Demonstrates exceptional capabilities in mathematics, significantly outperforming both open-source and closed-source models on the American Invitational Mathematics Examination (AIME 2024), the MATH benchmark, and the Chinese National High School Mathematics League (CNMO 2024).
Chinese Language Understanding: Matches the performance of Qwen2.5-72B in educational assessments like C-Eval and pronoun disambiguation tasks, while showing superior performance in factual knowledge (C-SimpleQA).

Enhanced Generation Speed

DeepSeek-V3 raises generation speed from 20 tokens per second (TPS) to 60 TPS. This threefold increase is attributed to innovations in algorithms and engineering, and it makes interactions noticeably faster and smoother.
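As a rough illustration of what the speedup means in practice, the sketch below estimates how long a response takes to stream at each speed. The 1,200-token response length is a hypothetical example, and the calculation assumes a constant generation rate and ignores the time before the first token arrives.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a constant generation speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


RESPONSE_TOKENS = 1_200  # hypothetical response length, not from the announcement

old_time = generation_seconds(RESPONSE_TOKENS, 20)  # previous speed
new_time = generation_seconds(RESPONSE_TOKENS, 60)  # DeepSeek-V3 speed
print(f"at 20 TPS: {old_time:.0f} s, at 60 TPS: {new_time:.0f} s")
```

Under these assumptions, the same response that previously took a minute to stream would finish in about twenty seconds.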

API Service Pricing

With the introduction of DeepSeek-V3, DeepSeek AI has adjusted its API service pricing to reflect the enhanced performance and speed. The new pricing structure is as follows:

Input tokens: ¥0.5 per million tokens (cache hit) / ¥2 per million tokens (cache miss)
Output tokens: ¥8 per million tokens

To encourage adoption, DeepSeek AI is offering a special introductory pricing period of 45 days, valid until February 8, 2025. During this period, both new and existing users will continue to pay the original prices:

Input tokens: ¥0.1 per million tokens (cache hit) / ¥1 per million tokens (cache miss)
Output tokens: ¥2 per million tokens
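The two price lists above can be turned into a small cost estimator. The following is a minimal sketch, not an official billing tool: the class, constant, and function names are made up for illustration, and the token counts in the example are hypothetical. Only the per-million-token prices come from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceTable:
    """Prices in yuan per million tokens."""
    input_cache_hit: float
    input_cache_miss: float
    output: float


# Prices taken from the two lists above.
STANDARD = PriceTable(input_cache_hit=0.5, input_cache_miss=2.0, output=8.0)
INTRODUCTORY = PriceTable(input_cache_hit=0.1, input_cache_miss=1.0, output=2.0)


def estimate_cost(prices: PriceTable, hit_tokens: int,
                  miss_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in yuan for the given token counts."""
    total = (hit_tokens * prices.input_cache_hit
             + miss_tokens * prices.input_cache_miss
             + output_tokens * prices.output)
    return total / 1_000_000


# Hypothetical workload: 2M cached input, 3M uncached input, 1M output tokens.
workload = dict(hit_tokens=2_000_000, miss_tokens=3_000_000,
                output_tokens=1_000_000)
print(f"standard:     ¥{estimate_cost(STANDARD, **workload):.2f}")
print(f"introductory: ¥{estimate_cost(INTRODUCTORY, **workload):.2f}")
```

For this example workload the estimate comes to ¥15.00 at the new prices and ¥5.20 during the introductory period. Output tokens dominate the bill at the new prices, so the share of output in a workload matters more than the cache hit rate.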

Open Source Availability and Local Deployment

DeepSeek-V3 was trained in FP8, and its native FP8 weights are released as open source. The open-source community has quickly embraced the model: SGLang and LMDeploy support native FP8 inference, while TensorRT-LLM and MindIE have implemented BF16 inference. DeepSeek AI also provides a script for converting the weights from FP8 to BF16 to make community adaptation easier.
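The choice between FP8 and BF16 has a direct effect on how much storage and memory the weights need, since FP8 uses one byte per parameter and BF16 uses two. The sketch below gives a back-of-the-envelope estimate based on the 671-billion-parameter figure. It is an approximation under stated assumptions: it ignores quantization scaling factors, any additional modules shipped with the checkpoint, and all runtime memory such as activations and the KV cache. The names are illustrative.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}

TOTAL_PARAMS = 671e9  # parameter count quoted in this article


def weight_size_gb(num_params: float, dtype: str) -> float:
    """Approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unsupported dtype: {dtype}")
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("fp8", "bf16"):
    print(f"{dtype}: about {weight_size_gb(TOTAL_PARAMS, dtype):.0f} GB")
```

By this estimate, converting to BF16 roughly doubles the footprint of the weights, from about 671 GB to about 1,342 GB, which is worth keeping in mind when choosing an inference framework for local deployment.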

You can download the model weights and find detailed information about local deployment on Hugging Face.

DeepSeek's Commitment to Open Source

DeepSeek AI remains committed to the spirit of open source and the pursuit of accessible AGI. The release of DeepSeek-V3 underscores this commitment, aiming to narrow the gap between open-source and closed-source model capabilities.

Key Takeaways:

DeepSeek-V3 is a cutting-edge open-source model.
It rivals top closed-source models in performance.
It offers significantly improved generation speed.
Its weights are open source and can be deployed locally.

The release of DeepSeek-V3 marks a significant advancement in AI, promising more sophisticated features in the future. DeepSeek AI aims to continue sharing its advancements with the community, fostering innovation and collaboration. Explore more about DeepSeek AI on their GitHub.
