DeepSeek has officially launched DeepSeek-V2.5, a next-generation open-source model that seamlessly blends general conversational capabilities with robust code processing power. This innovative model is designed to offer a more streamlined, intelligent, and efficient user experience for a wide range of applications, marking a significant leap forward in the field of AI.
DeepSeek-V2.5 represents a fusion of DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724, combining the strengths of both models. This powerful combination results in a model that not only retains the general conversational abilities of the Chat model but also maintains the robust code processing capabilities of the Coder model. Furthermore, DeepSeek-V2.5 is meticulously aligned with human preferences, making it more intuitive and user-friendly.
The model is available on both the web and API, featuring backward-compatible API endpoints accessible through deepseek-coder
or deepseek-chat
. Key features such as Function Calling, FIM (Fill-In-the-Middle) completion, and JSON output remain unchanged, ensuring a smooth transition for existing users.
DeepSeek's commitment to model refinement has been a driving force behind the development of DeepSeek-V2.5. Here's a look at its evolution:
June Upgrade: DeepSeek-V2-Chat's base model was replaced with Coder-V2-base, greatly enhancing its code generation and reasoning capabilities. This upgrade led to the release of DeepSeek-V2-Chat-0628.
Coder Model Launch: Shortly after, DeepSeek-Coder-V2-0724 was launched with improved general capabilities through alignment optimization.
Model Fusion: Ultimately, the Chat and Coder models were successfully merged to create the new, unified DeepSeek-V2.5, offering the best of both worlds.
It's recommended to adjust system prompts and temperature settings for optimal results due to the significant updates in this version. Understanding and tweaking the temperature parameter can help in achieving desired outputs.
DeepSeek-V2.5 has been rigorously evaluated using industry-standard test sets, consistently outperforming both DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724 on most benchmarks.
Internal Chinese evaluations reveal significant improvements in win rates against GPT-4o mini and ChatGPT-4o-latest, particularly in tasks like content creation and Q&A, enhancing overall user satisfaction.
DeepSeek has prioritized safety and helpfulness throughout the development process. DeepSeek-V2.5 features clearly defined safety boundaries, improving resistance to jailbreak attacks while reducing overgeneralization of safety policies to normal queries.
Model | Overall Safety Score (higher is better) | Safety Spillover Rate (lower is better) |
---|---|---|
DeepSeek-V2-0628 | 74.4% | 11.3% |
DeepSeek-V2.5 | 82.6% | 4.6% |
DeepSeek-V2.5 retains the robust code capabilities of DeepSeek-Coder-V2-0724, demonstrating notable improvements in the HumanEval Python and LiveCodeBench tests. While DeepSeek-Coder-V2-0724 slightly outperformed in HumanEval Multilingual and Aider tests, both versions showed room for improvement in the SWE-verified test.
Additionally, the DS-FIM-Eval internal test set showcased a 5.1% improvement in the FIM completion task, enhancing the plugin completion experience. DeepSeek-V2.5 has also been optimized for common coding scenarios to improve user experience. In the DS-Arena-Code internal subjective evaluation, DeepSeek-V2.5 achieved a significant win rate increase against competitors.
DeepSeek-V2.5 is now available as an open-source model on Hugging Face. You can explore and download it here. Make sure you understand token usage to effectively utilize the model.
DeepSeek-V2.5 marks a significant advancement in AI technology, seamlessly integrating general conversational capabilities with powerful coding functionalities. Its open-source availability promotes collaboration and innovation within the AI community. This model promises a more versatile and efficient user experience, paving the way for new possibilities in both general AI applications and specialized coding tasks. Stay updated with the latest news on DeepSeek and explore related models like DeepSeek-R1 for supercharged reasoning capabilities.