The landscape of Artificial Intelligence is rapidly evolving, with a significant shift towards edge computing. Microsoft is at the forefront of this transformation, enabling developers to run powerful AI models directly on Copilot+ PCs. This article delves into the exciting possibilities of running distilled DeepSeek R1 models locally, powered by the Windows Copilot Runtime, and how this innovative approach is revolutionizing AI development.
AI is no longer confined to the cloud. Copilot+ PCs, equipped with powerful Neural Processing Units (NPUs), are ushering in a new era of on-device AI processing. This means faster, more efficient AI performance, reduced latency, and enhanced privacy for users.
Key Benefits of On-Device AI:
- Lower latency: inference runs on the local NPU, with no round trip to the cloud.
- Greater efficiency: the NPU executes AI workloads quickly and power-efficiently, leaving the CPU and GPU free for other tasks.
- Enhanced privacy: prompts and data never have to leave the device.
Microsoft is collaborating with DeepSeek to bring NPU-optimized versions of the DeepSeek R1 models directly to Copilot+ PCs. Support begins with Qualcomm Snapdragon X and will soon expand to Intel Core Ultra 200V and other platforms. The initial release features the DeepSeek-R1-Distill-Qwen-1.5B model, with the 7B and 14B variants to follow. These models are specifically optimized to leverage the capabilities of the NPU, enabling developers to build and deploy AI-powered applications that run efficiently on-device.
Experimenting with DeepSeek R1 models on your Copilot+ PC is straightforward. Simply download the AI Toolkit VS Code extension.
Steps to get started:
1. Install the AI Toolkit extension from the VS Code Marketplace.
2. Open the extension's model catalog and download the NPU-optimized DeepSeek-R1-Distill-Qwen-1.5B model to your Copilot+ PC.
3. Open the model in the Playground and start sending prompts.
The AI Toolkit provides a seamless developer workflow, allowing you to test models locally and prepare them for deployment. You can also try the cloud-hosted source model in Azure AI Foundry by clicking the “Try in Playground” button under “DeepSeek R1”.
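If you prefer the terminal, the extension can also be installed with the VS Code CLI. The extension ID below is an assumption based on the current Marketplace listing; confirm it with `code --list-extensions` or in the Extensions view before relying on it.

```shell
# Install the AI Toolkit extension via the VS Code CLI.
# EXT_ID is the assumed Marketplace ID; verify against the listing.
EXT_ID="ms-windows-ai-studio.windows-ai-studio"
if command -v code >/dev/null 2>&1; then
  code --install-extension "$EXT_ID"
else
  echo "VS Code CLI not found; install '$EXT_ID' from the Extensions view instead."
fi
```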
The distilled Qwen 1.5B model comprises several components, including a tokenizer, embedding layer, context processing model, token iteration model, a language model head, and a de-tokenizer.
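To make those stages concrete, here is a toy pipeline in Python. Every function, weight, and the tiny vocabulary are invented for illustration; the real ONNX components are learned neural networks, but the data flow is the same: tokenize, embed, run context processing once over the whole prompt (prefill), then iterate the token model one step at a time, project through the language model head, and de-tokenize.

```python
# Toy sketch of the distilled pipeline's stages (illustrative only; all
# functions and the vocabulary are invented, not the real ONNX components).

VOCAB = ["<eos>", "hello", "world", "npu", "ai"]

def tokenize(text):                      # tokenizer: text -> token ids
    return [VOCAB.index(w) for w in text.split()]

def embed(ids):                          # embedding layer: ids -> vectors
    return [[float(i), float(i) * 0.5] for i in ids]

def prefill(vectors):                    # context processing: whole prompt at once
    s0 = sum(v[0] for v in vectors)
    s1 = sum(v[1] for v in vectors)
    return [s0, s1]                      # toy stand-in for the cached state

def decode_step(state):                  # token iteration: one step per new token
    return [state[0] * 0.5, state[1] * 0.5]

def lm_head(state):                      # language model head: state -> token id
    return int(state[0]) % len(VOCAB)

def detokenize(ids):                     # de-tokenizer: ids -> text
    return " ".join(VOCAB[i] for i in ids)

def generate(prompt, max_new_tokens=3):
    state = prefill(embed(tokenize(prompt)))   # prefill runs once
    out = []
    for _ in range(max_new_tokens):            # decode runs per token
        state = decode_step(state)
        tok = lm_head(state)
        if tok == 0:                           # <eos> ends generation
            break
        out.append(tok)
    return detokenize(out)
```

The split matters for the NPU: prefill processes the full prompt in one pass, while the per-token decode loop is what dominates interactive latency.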
Optimization Techniques Overview:
To achieve a low memory footprint and fast inference, several key optimizations were implemented:
- Low-bit quantization: weights are quantized to 4 bits (block-wise for the embeddings and language model head, per-channel for the context processing and token iteration models), dramatically shrinking memory use.
- CPU/NPU work split: the memory-access-heavy embeddings and language model head run on the CPU, while the compute-intensive transformer blocks run on the NPU.
- Sliding-window design: the prompt is processed in fixed-size windows, delivering fast time to first token and long-context support without requiring dynamic tensor shapes from the hardware stack.
- ONNX QDQ format: the quantized model ships in ONNX Runtime's QDQ format, so it can scale across the breadth of Windows devices.
These optimizations enable the DeepSeek R1 models to deliver performance comparable to larger models, all while maintaining a compact memory footprint.
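Low-bit block-wise quantization is a standard way to shrink such models, and the core idea is easy to sketch. The snippet below is a minimal illustrative implementation of 4-bit block-wise quantization in plain Python (an assumed simplification, not Microsoft's actual code): each block of weights maps to integers in [0, 15] plus a per-block scale and offset, and dequantizes with error bounded by half the block's scale.

```python
# Illustrative 4-bit block-wise weight quantization (not production code).

def quantize_block(block):
    """Map a block of floats to 4-bit values [0, 15] plus scale and offset."""
    lo, hi = min(block), max(block)
    scale = (hi - lo) / 15 or 1.0        # avoid zero scale for constant blocks
    q = [round((x - lo) / scale) for x in block]
    return q, scale, lo

def dequantize_block(q, scale, lo):
    """Reconstruct approximate floats from the quantized block."""
    return [v * scale + lo for v in q]

def blockwise_quantize(weights, block_size=4):
    """Quantize a flat weight list one block at a time."""
    return [quantize_block(weights[i:i + block_size])
            for i in range(0, len(weights), block_size)]
```

Smaller blocks give tighter scales (lower error) at the cost of more per-block metadata; that trade-off is why block-wise schemes outperform a single scale for the whole tensor.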
The NPU-optimized version of the DeepSeek R1 models delivers impressive performance, enabling users to interact with groundbreaking AI models entirely locally. This opens the door to new, innovative PC experiences, empowering developers to create applications that were previously impossible.
Performance Metrics (Feb 3, 2025 Update):
- Time to first token: roughly 130 ms on the 1.5B model.
- Throughput: up to 16 tokens per second for short prompts (under 64 tokens).
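Time to first token and throughput combine into end-to-end response latency in a simple way, which this small helper shows. The function name and the figures in the example are illustrative assumptions, not measured results.

```python
def response_latency_s(ttft_ms: float, throughput_tok_s: float, new_tokens: int) -> float:
    """End-to-end latency: time to first token, plus steady-state decode
    time for the remaining tokens at the given throughput."""
    return ttft_ms / 1000.0 + (new_tokens - 1) / throughput_tok_s

# Illustrative figures (assumed, not measurements): 130 ms TTFT,
# 16 tok/s decode, generating a 17-token reply.
latency = response_latency_s(130, 16, 17)
```

With those assumed figures the reply completes in about 1.13 seconds, which illustrates why time to first token dominates the perceived snappiness of short interactions.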
Microsoft's commitment to bringing AI to the edge is transforming the way we interact with technology. By enabling developers to run powerful models like DeepSeek R1 locally on Copilot+ PCs, powered by the Windows Copilot Runtime and ONNX Runtime, Microsoft is empowering a new generation of AI-powered applications and experiences. This groundbreaking capability promises a future where AI is more accessible, efficient, and integrated into our daily lives.