DeepSeek has announced the release of DeepSeek-V3, marking a significant advancement in their AI model offerings. This latest iteration promises enhanced capabilities while upholding the company's commitment to open-source principles. Let's delve into the key features and improvements of DeepSeek-V3.
DeepSeek-V3 generates text at 60 tokens per second, three times faster than its predecessor, V2, while delivering stronger capabilities across a wide range of tasks. Importantly, API compatibility is maintained, allowing a seamless transition for existing users. Both the models and the accompanying research papers are fully open-source, fostering transparency and community collaboration.
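Because DeepSeek's API follows the OpenAI-compatible format, existing integrations should keep working unchanged. A minimal request might look like the sketch below; the API key is a placeholder, and the base URL and model name follow DeepSeek's documented conventions:

```python
# Minimal sketch of a chat request against DeepSeek's OpenAI-compatible API.
# Requires the `openai` Python package; "YOUR_API_KEY" is a placeholder.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",               # placeholder, use your own key
    base_url="https://api.deepseek.com",  # DeepSeek's API endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",  # the deepseek-chat endpoint now serves V3
    messages=[{"role": "user", "content": "Hello, DeepSeek-V3!"}],
)
print(response.choices[0].message.content)
```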
Key highlights include:
- 671 billion total parameters in a Mixture-of-Experts (MoE) architecture
- 37 billion parameters activated for each token during inference, keeping the model both efficient and powerful
- Pre-trained on 14.8 trillion high-quality tokens, giving it broad coverage of diverse data
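For intuition on how only a fraction of an MoE model's parameters is active at once, here is a toy sketch of top-k expert routing. It is purely illustrative, not DeepSeek-V3's actual routing; the expert count, hidden size, and top-k value are made-up numbers:

```python
import numpy as np

# Toy sketch of top-k Mixture-of-Experts routing: each token is sent to
# only a few experts, so only a fraction of the total parameters is used
# per forward pass. Dimensions here are illustrative, not DeepSeek-V3's.
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))  # gating weights

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router                   # one routing score per expert
    chosen = np.argsort(scores)[-top_k:]  # indices of the top-k experts
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                  # softmax over the chosen experts
    # Only `top_k` of the `n_experts` weight matrices participate here;
    # the remaining parameters stay inactive for this token.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, chosen))

token = rng.standard_normal(d_model)
print(moe_forward(token).shape)  # (16,)
```

DeepSeek-V3 applies this same principle at scale: 671 billion total parameters, of which roughly 37 billion are active for any given token.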
For those who want to examine the model and its architecture in depth, the following resources are available:
DeepSeek offers competitive pricing for their API. Until February 8th, pricing remains the same as V2. After February 8th, the following rates apply (per million tokens):
- Input (cache hit): $0.07
- Input (cache miss): $0.27
- Output: $1.10
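To make these rates concrete, the short calculator below estimates the cost of a single request from its token counts. The helper function and example numbers are illustrative, and current rates should always be verified against the DeepSeek API Docs:

```python
# Back-of-the-envelope cost for one API call at the post-promotion rates
# listed above (verify current rates against the DeepSeek API Docs).
PRICE_PER_M = {"input_cache_hit": 0.07, "input_cache_miss": 0.27, "output": 1.10}

def request_cost(input_tokens: int, output_tokens: int, cache_hit: bool = False) -> float:
    """Return the USD cost of a single request."""
    in_rate = PRICE_PER_M["input_cache_hit" if cache_hit else "input_cache_miss"]
    return (input_tokens * in_rate + output_tokens * PRICE_PER_M["output"]) / 1_000_000

# Example: 2,000 input tokens and 500 output tokens, no cache hit:
print(f"${request_cost(2_000, 500):.6f}")  # $0.001090
```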
DeepSeek emphasizes that their pricing provides the best value in the market, balancing cost-effectiveness with high performance.
DeepSeek's commitment to an open-source spirit and long-term vision aims to foster inclusive Artificial General Intelligence (AGI). By sharing their progress with the community, they hope to bridge the gap between open and closed AI models. The future holds exciting developments, including multimodal support and other state-of-the-art features within the DeepSeek ecosystem.
For the latest updates, please refer to the DeepSeek API Docs.