DeepSeek R1 has recently gained immense popularity, sparking numerous discussions and tutorials on local deployment. This article provides a simplified guide to deploying and using DeepSeek R1 locally, treating it as a digital companion to experiment with. For advanced usage, consider referring to the official research paper and deploying the full version.
Released on January 20, 2025, DeepSeek R1 is DeepSeek AI's first-generation reasoning model, positioned as a competitor to OpenAI's o1. It excels at complex tasks such as mathematical reasoning, code generation, and logical inference.
DeepSeek R1 comes in multiple versions, including a full version (671B parameters) and distilled versions (1.5B to 70B parameters). While the full version offers superior performance, it demands substantial hardware resources. The distilled versions are more accessible for local deployment on standard hardware.
Full Version (671B parameters): best quality, but demands data-center-class hardware.
Distilled Versions (1.5B to 70B parameters): much lighter, practical for local deployment on consumer hardware.
Distilled vs. Full Version: Key Differences
Feature | Distilled Version | Full Version |
---|---|---|
Parameter Count | Fewer parameters (1.5B to 70B), approaches full-model performance | 671B parameters, top performance |
Hardware | Lower VRAM/RAM requirements | Higher VRAM/RAM requirements |
Use Cases | Lightweight tasks, resource-constrained devices | High-precision tasks, professional environments |
Deep Dive into Distilled Model Variants
Model Version | Parameters | Characteristics |
---|---|---|
deepseek-r1:1.5b | 1.5B | Lightweight, suitable for low-end hardware, fast but limited performance |
deepseek-r1:7b | 7B | Balanced performance, good for most tasks, moderate hardware requirements |
deepseek-r1:8b | 8B | Slightly better than 7B, suitable for higher precision tasks |
deepseek-r1:14b | 14B | High-performance, suitable for complex tasks (math, coding), higher hardware demands |
deepseek-r1:32b | 32B | Professional-grade, for research and high-precision tasks, requires advanced hardware |
deepseek-r1:70b | 70B | Top-tier performance, for large-scale computation and complex tasks |
Quantized Versions: Further Optimization
Quantized models offer reduced memory footprints using techniques like 4-bit quantization, further optimizing them for resource-limited environments. Note that quantization can slightly reduce model accuracy.
Model Version | Parameters | Characteristics |
---|---|---|
deepseek-r1:1.5b-qwen-distill-q4_K_M | 1.5B | Lightweight, fast, limited performance, 4-bit quantization |
deepseek-r1:7b-qwen-distill-q4_K_M | 7B | Balanced, good for most tasks, moderate hardware, 4-bit quantization |
deepseek-r1:8b-llama-distill-q4_K_M | 8B | Slightly better than 7B, suitable for higher precision tasks, 4-bit quantization |
deepseek-r1:14b-qwen-distill-q4_K_M | 14B | High-performance, complex tasks (math, coding), higher hardware, 4-bit quantization |
deepseek-r1:32b-qwen-distill-q4_K_M | 32B | Professional, research, high-precision tasks, advanced hardware, 4-bit quantization |
deepseek-r1:70b-llama-distill-q4_K_M | 70B | Top-tier, large-scale computation, complex tasks, requires professional hardware, 4-bit quantization |
Distilled vs. Quantized: A Quick Comparison
Model Type | Characteristics |
---|---|
Distilled | Fine-tuned for reduced parameters and near-original performance on lower-end hardware |
Quantized | Reduced memory usage via lower precision (e.g., 4-bit quantization) |
Example: deepseek-r1:7b-qwen-distill-q4_K_M is a 7B model that is both distilled and quantized, reducing VRAM needs from roughly 5 GB to 3 GB.
For most local deployments, the distilled versions are sufficient.
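If a quantized distilled tag fits your hardware, downloading and running it with Ollama (covered in the deployment section below) uses the same commands as any other tag; the tag names are the ones listed in the tables above:

```bash
# Pull a 4-bit quantized distilled variant, then start an interactive chat with it
ollama pull deepseek-r1:7b-qwen-distill-q4_K_M
ollama run deepseek-r1:7b-qwen-distill-q4_K_M
```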
Choosing the right model hinges on your hardware. Here's a breakdown:
2.1 System Configuration
Requirements differ across Windows, Linux, and Mac, but in every case the deciding factors are system RAM and GPU VRAM. Check what your machine actually has (see the commands below), then match it against the table that follows.
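The commands below are standard OS utilities for checking RAM and VRAM (nothing DeepSeek-specific):

```bash
# Windows/Linux with an NVIDIA GPU: GPU model and total VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv

# Linux: total system RAM
free -h

# macOS: total system RAM (in bytes) and GPU/chip details
sysctl hw.memsize
system_profiler SPDisplaysDataType
```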
Model Selection Guide:
Model Name | Parameters | Size | VRAM (Approx.) | Recommended Mac Config | Recommended Windows/Linux Config |
---|---|---|---|---|---|
deepseek-r1:1.5b | 1.5B | 1.1 GB | ~2 GB | M2/M3 MacBook Air (8GB+ RAM) | GTX 1650 4GB / RX 5500 4GB (16GB+ RAM) |
deepseek-r1:7b | 7B | 4.7 GB | ~5 GB | M2/M3/M4 MacBook Pro (16GB+ RAM) | RTX 3060 8GB / RX 6600 8GB (16GB+ RAM) |
deepseek-r1:8b | 8B | 4.9 GB | ~6 GB | M2/M3/M4 MacBook Pro (16GB+ RAM) | RTX 3060 Ti 8GB / RX 6700 10GB (16GB+ RAM) |
deepseek-r1:14b | 14B | 9.0 GB | ~10 GB | M2/M3/M4 Pro MacBook Pro (32GB+ RAM) | RTX 3080 10GB / RX 6800 16GB (32GB+ RAM) |
deepseek-r1:32b | 32B | 20 GB | ~22 GB | M2 Max/Ultra Mac Studio | RTX 3090 24GB / RX 7900 XTX 24GB (64GB+ RAM) |
deepseek-r1:70b | 70B | 43 GB | ~45 GB | M2 Ultra Mac Studio | A100 40GB / MI250X 128GB (128GB+ RAM) |
deepseek-r1:1.5b-qwen-distill-q4_K_M | 1.5B | 1.1 GB | ~2 GB | M2/M3 MacBook Air (8GB+ RAM) | GTX 1650 4GB / RX 5500 4GB (16GB+ RAM) |
deepseek-r1:7b-qwen-distill-q4_K_M | 7B | 4.7 GB | ~5 GB | M2/M3/M4 MacBook Pro (16GB+ RAM) | RTX 3060 8GB / RX 6600 8GB (16GB+ RAM) |
deepseek-r1:8b-llama-distill-q4_K_M | 8B | 4.9 GB | ~6 GB | M2/M3/M4 MacBook Pro (16GB+ RAM) | RTX 3060 Ti 8GB / RX 6700 10GB (16GB+ RAM) |
deepseek-r1:14b-qwen-distill-q4_K_M | 14B | 9.0 GB | ~10 GB | M2/M3/M4 Pro MacBook Pro (32GB+ RAM) | RTX 3080 10GB / RX 6800 16GB (32GB+ RAM) |
deepseek-r1:32b-qwen-distill-q4_K_M | 32B | 20 GB | ~22 GB | M2 Max/Ultra Mac Studio | RTX 3090 24GB / RX 7900 XTX 24GB (64GB+ RAM) |
deepseek-r1:70b-llama-distill-q4_K_M | 70B | 43 GB | ~45 GB | M2 Ultra Mac Studio | A100 40GB / MI250X 128GB (128GB+ RAM) |
This guide focuses on deploying DeepSeek R1 on a Mac environment:
Environment: M2/M3/M4 MacBook Pro (16GB RAM)
Model: deepseek-r1:8b
Local Run Benefits: your data stays on your machine, the model works offline, and there are no API usage fees.
While this tutorial covers macOS, stay tuned for future updates on Windows/Linux deployments.
3.1 Deployment Tools
Various tools simplify DeepSeek R1 deployment:
- Ollama: run `ollama run deepseek-r1:7b` to download and run a model. Refer to the official Ollama documentation for detailed usage. Ollama can also run inside Docker: `docker run -d --gpus=all -p 11434:11434 --name ollama ollama/ollama`.
- LM Studio: a desktop application with a graphical interface, worth a look if you only have integrated graphics and want to experiment.

This guide uses Ollama due to its flexibility and suitability for users seeking data privacy and customization. Please see our article on [getting started with Docker] for more info.
3.2 Installing Ollama
Install Ollama from the official website, then verify the installation in your terminal: `ollama --version`
Installing the Model:
Copy the installation command from the Ollama website and run it in your terminal:
ollama run deepseek-r1:8b
This command downloads and initiates the DeepSeek R1 8B model.
Post-installation, monitor your hardware usage. A saturated GPU with moderate CPU and memory usage indicates optimal performance.
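To watch utilization while a prompt is running, the usual OS tools are enough (these are general-purpose utilities, not part of Ollama):

```bash
# Windows/Linux with NVIDIA drivers: refresh GPU utilization and VRAM usage every second
nvidia-smi -l 1

# macOS (Apple silicon): sample GPU power/utilization, or simply use Activity Monitor
sudo powermetrics --samplers gpu_power -i 1000
```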
Essential Ollama Commands:
- `ollama list`: list the models installed locally.
- `ollama rm deepseek-r1:8b`: remove a downloaded model.
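Ollama also serves a local REST API on port 11434 (the same port mapped in the Docker command above), which the web interfaces in the next section talk to. A quick way to test it from the terminal, assuming the 8B model from this guide is installed:

```bash
# Send a one-off prompt to the local Ollama API and get a complete (non-streamed) response
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:8b",
  "prompt": "Explain distillation vs. quantization in one paragraph.",
  "stream": false
}'
```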
Enhance your DeepSeek R1 experience with these visual interfaces:
Open-WebUI: A self-hosted LLM web interface for seamless interaction with local models, built for personal LLM use.
Installation (Docker):
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Access the interface at http://localhost:3000/.
Important: Disable the OpenAI API option in settings if offline to prevent errors.
Dify: An LLM application development platform for rapidly building applications such as RAG pipelines and AI agents. Suitable for AI SaaS, intelligent customer service, and retrieval-augmented (RAG) applications.
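If you have not installed Dify yet, it is normally started with Docker Compose; the steps below are a sketch based on the layout of the Dify repository (check the official docs for current instructions):

```bash
# Clone the Dify repository and start its Docker Compose stack
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env      # adjust environment settings as needed
docker compose up -d
```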
After successful startup, access Dify through `localhost`. When adding Ollama as a model provider, use `http://host.docker.internal:11434` as the base URL if Dify and Ollama are on the same machine and Dify is Dockerized.

Note: Dify offers more than just conversations; explore its features for advanced applications.
During local testing, the distilled version of DeepSeek R1 showed some shortcomings in code generation but demonstrated impressive text processing and reasoning capabilities.
For a better experience with the complete DeepSeek model, consider using DeepSeek's official API services, which are competitively priced. When DeepSeek first gained popularity, users occasionally encountered server overload; the DeepSeek team hopes to resolve these issues in the near future.
DeepSeek's API can be integrated into VS Code using plugins like Continue, or into Open-WebUI.
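DeepSeek's hosted API uses the OpenAI-compatible chat-completions format, which is why tools like Continue and Open-WebUI can point at it with just a base URL and an API key. A minimal sketch (endpoint and model name per DeepSeek's public API documentation; the key is your own):

```bash
# Call the hosted DeepSeek reasoning model (create an API key on the DeepSeek platform first)
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-reasoner",
    "messages": [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
  }'
```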
DeepSeek R1 presents a compelling option for local AI experimentation. While DeepSeek R1 and OpenAI's o1 each have their own strengths, DeepSeek shines in certain applications. By leveraging the deployment strategies and tools outlined in this guide, you can easily integrate DeepSeek R1 into your workflow.