DeepSeek AI has officially released DeepSeek-V3, the first model in its new series, and simultaneously open-sourced it to the AI community. This launch marks a significant milestone in the pursuit of accessible and powerful artificial intelligence.
The DeepSeek-V3 model is now available for interaction on the official website, chat.deepseek.com. The API service has also been updated, requiring no configuration changes for existing users. It's worth noting that the current version does not support multimodal input and output.
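As an illustration, the API remains drop-in compatible with existing integrations; the sketch below assumes an OpenAI-compatible SDK style, the https://api.deepseek.com base URL, and the deepseek-chat model name, so adjust these details to match your own setup.

```python
# Minimal sketch of calling the updated API via an OpenAI-compatible client.
# The base URL, model name, and placeholder key are assumptions; substitute
# the values from your own account and integration.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder credential
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                # now served by DeepSeek-V3
    messages=[{"role": "user", "content": "Hello, DeepSeek-V3!"}],
)
print(response.choices[0].message.content)
```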
DeepSeek-V3 is a self-developed Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated per token. It has been pre-trained on a massive 14.8 trillion tokens.
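To make the "671 billion total, 37 billion activated" distinction concrete, here is a toy sketch of top-k expert routing in a MoE layer: every token is scored by a router and dispatched to only a few experts, so most parameters stay idle for any given token. The dimensions, expert count, and k value are purely illustrative and are not DeepSeek-V3's actual configuration.

```python
# Toy top-k Mixture-of-Experts routing: each token only runs through k experts,
# so only a small fraction of the layer's parameters are active per token.
# All sizes here are illustrative, not DeepSeek-V3's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        gate_probs = F.softmax(self.router(x), dim=-1)
        topk_probs, topk_idx = gate_probs.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # only k experts run per token
            for e in topk_idx[:, slot].unique():
                mask = topk_idx[:, slot] == e
                out[mask] += topk_probs[mask, slot:slot + 1] * self.experts[int(e)](x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```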
For detailed technical specifications, you can refer to the research paper available on GitHub.
DeepSeek-V3 has demonstrated impressive performance across various benchmarks, surpassing other open-source models like Qwen2.5-72B and Llama-3.1-405B. In several key areas, it rivals the performance of leading closed-source models such as GPT-4o and Claude-3.5-Sonnet.
Through innovative algorithms and engineering optimizations, DeepSeek-V3 achieves a generation speed of 60 tokens per second (TPS), a threefold increase compared to the V2.5 model.
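Readers who want to sanity-check throughput on their own requests can time a streamed completion; the sketch below reuses the hypothetical client setup from the earlier snippet and counts streamed chunks as a rough proxy for tokens, so treat the resulting number as an approximation rather than an official measurement.

```python
# Rough, illustrative throughput check: stream a completion and estimate
# tokens per second by counting streamed content chunks (a chunk is not
# always exactly one token, so this is only an approximation).
import time
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

start = time.time()
chunks = 0
stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Write a short paragraph about MoE models."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        chunks += 1
elapsed = time.time() - start
print(f"~{chunks / elapsed:.1f} chunks/sec (rough proxy for TPS)")
```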
With the introduction of the more powerful and faster DeepSeek-V3, the model API service pricing has been adjusted to:
However, DeepSeek AI is offering a 45-day promotional period where the API service pricing will remain at the previous rates:
This promotional pricing runs until February 8, 2025, and is available to existing users as well as new users who register during the period.
DeepSeek-V3 is trained in FP8, and its native FP8 weights are open-sourced. The open-source community already supports native FP8 inference for the V3 model through SGLang and LMDeploy, while TensorRT-LLM and MindIE provide BF16 inference. DeepSeek AI also offers a script for converting the FP8 weights to BF16.
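The official conversion script in the DeepSeek-V3 repository is the authoritative reference; the snippet below is only a simplified, hypothetical illustration of the underlying idea, assuming block-quantized FP8 weights with one scale per block (the real script's tensor names and block layout may differ).

```python
# Simplified, hypothetical illustration of FP8 -> BF16 dequantization with a
# per-block scale. Block size, tensor names, and layout are assumptions; the
# official conversion script is the authoritative implementation.
import torch

def dequantize_fp8_block(weight_fp8: torch.Tensor,
                         scale: torch.Tensor,
                         block: int = 128) -> torch.Tensor:
    """Upcast a block-quantized FP8 weight to BF16 and re-apply its block scales."""
    w = weight_fp8.to(torch.bfloat16)
    rows, cols = w.shape
    out = torch.empty_like(w)
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            out[i:i + block, j:j + block] = (
                w[i:i + block, j:j + block] * scale[i // block, j // block]
            )
    return out

# Toy usage with random data standing in for real checkpoint tensors.
w_fp8 = torch.randn(256, 256).to(torch.float8_e4m3fn)
scales = torch.rand(2, 2, dtype=torch.bfloat16)
print(dequantize_fp8_block(w_fp8, scales).dtype)  # torch.bfloat16
```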
Model weights and deployment information can be found on Hugging Face. See how DeepSeek compares to other models on the Open Model Initiative Leaderboard.
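For convenience, a minimal sketch for fetching the checkpoint with huggingface_hub is shown below; the deepseek-ai/DeepSeek-V3 repository id is assumed from the announcement, and the model card remains the place to check for the recommended inference stack.

```python
# Minimal sketch for downloading the open-source weights from Hugging Face.
# The repo id is an assumption based on the announcement; consult the model
# card for deployment details (SGLang, LMDeploy, TensorRT-LLM, MindIE).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="deepseek-ai/DeepSeek-V3",  # assumed official model repository
    local_dir="./DeepSeek-V3",          # destination for the checkpoint files
)
print(f"Weights downloaded to {local_dir}")
```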
DeepSeek AI's commitment to open-source principles continues with the release of DeepSeek-V3, believing that sharing its advancements with the community will help diminish the capability gap between open and closed-source models.
This is just the beginning: DeepSeek AI plans to enhance the DeepSeek-V3 base model with features such as reasoning capabilities and multimodal support.