DeepSeek AI has officially released the first version of its DeepSeek-V3 model series and open-sourced it at the same time. The model is designed to compete with the leading closed-source models on the market across a variety of tasks.
DeepSeek-V3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated for each token. It was pre-trained on 14.8 trillion tokens.
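To put the MoE figures in perspective, the short sketch below computes the share of parameters that participates in each token. It uses only the two counts quoted above; the function and constant names are illustrative and are not taken from any DeepSeek code.

```python
# Parameter counts as quoted in the announcement.
TOTAL_PARAMS = 671_000_000_000   # all experts plus shared layers
ACTIVE_PARAMS = 37_000_000_000   # parameters used for a single token


def active_fraction(total: int, active: int) -> float:
    """Return the share of parameters that takes part in computing one token."""
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active / total


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share per token: {share:.1%}")  # prints 5.5%
```

Roughly one parameter in eighteen is used per token, which is why per-token compute is far lower than the total parameter count suggests, even though all 671 billion parameters still have to be stored.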
DeepSeek-V3 surpasses several open-source models, including Qwen2.5-72B and Llama-3.1-405B, in various benchmark tests. In some cases, it rivals the performance of top-tier closed-source models like GPT-4o and Claude-3.5-Sonnet.
Here’s a detailed look at its performance across different areas:
DeepSeek-V3 raises generation speed from 20 tokens per second (TPS) to 60 TPS. This threefold increase is attributed to innovations in algorithms and engineering, and it makes interactions noticeably faster and smoother.
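As a rough illustration of what the jump from 20 TPS to 60 TPS means for a user, the sketch below estimates how long a reply takes to stream at each rate. The 600-token reply length is a hypothetical example, and the calculation assumes a steady generation rate while ignoring network latency and the time to the first token.

```python
OLD_TPS = 20        # tokens per second before, as quoted above
NEW_TPS = 60        # tokens per second for DeepSeek-V3, as quoted above
REPLY_TOKENS = 600  # hypothetical reply length, chosen only for illustration


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to generate num_tokens at a constant rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    before = generation_seconds(REPLY_TOKENS, OLD_TPS)
    after = generation_seconds(REPLY_TOKENS, NEW_TPS)
    print(f"{REPLY_TOKENS} tokens: {before:.0f} s at {OLD_TPS} TPS, "
          f"{after:.0f} s at {NEW_TPS} TPS")  # 30 s versus 10 s
```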
With the introduction of DeepSeek-V3, DeepSeek AI has adjusted its API service pricing to reflect the enhanced performance and speed. The new pricing structure is as follows:
To encourage adoption, DeepSeek AI is offering a 45-day introductory pricing period, valid until February 8, 2025. During this period, both new and existing users will continue to pay the original prices:
DeepSeek-V3 was trained in FP8, and its native FP8 weights are released as open source. The open-source community has quickly embraced the model: SGLang and LMDeploy support native FP8 inference, while TensorRT-LLM and MindIE have implemented BF16 inference. DeepSeek AI also provides a script for converting the FP8 weights to BF16 to make community adaptation and application easier.
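The choice between FP8 and BF16 matters mostly for memory. The sketch below gives a back-of-the-envelope estimate of raw weight storage in each format, assuming exactly 1 byte per parameter for FP8 and 2 bytes for BF16 and using the 671 billion parameter count quoted earlier. It ignores scaling factors, any tensors kept at a different precision, and runtime memory such as the KV cache, so real checkpoint sizes will differ.

```python
TOTAL_PARAMS = 671_000_000_000  # total parameter count quoted in the announcement

# Storage width of each format in bytes per parameter.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}


def weight_size_gb(num_params: int, dtype: str) -> float:
    """Estimate raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unsupported dtype: {dtype!r}")
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("fp8", "bf16"):
        print(f"{dtype}: about {weight_size_gb(TOTAL_PARAMS, dtype):,.0f} GB")
```

Under these assumptions, converting to BF16 doubles the storage needed for the weights, which is worth checking against available disk and accelerator memory before running the conversion.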
You can download the model weights and find detailed information about local deployment on Hugging Face.
DeepSeek AI remains committed to the spirit of open source and the pursuit of accessible AGI. The release of DeepSeek-V3 underscores this commitment, aiming to narrow the gap between open-source and closed-source model capabilities.
The release of DeepSeek-V3 marks a significant advance in AI, with more sophisticated features promised for the future. DeepSeek AI aims to continue sharing its work with the community, fostering innovation and collaboration. More about DeepSeek AI is available on its GitHub.