In the rapidly evolving field of artificial intelligence, code generation models are becoming increasingly sophisticated. DeepSeek Coder, developed by DeepSeek AI, stands out as a powerful tool designed to automate and enhance the coding process. This article delves into the features, capabilities, and applications of DeepSeek Coder, providing a comprehensive overview for developers and AI enthusiasts alike.
DeepSeek Coder is a series of code language models trained from scratch on a massive dataset of 2 trillion tokens. These tokens consist of 87% code and 13% natural language, covering both English and Chinese. The models are available in several sizes, ranging from 1.3 billion to 33 billion parameters, making them scalable and adaptable to different user needs.
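For reference, the released checkpoints are published under the deepseek-ai organization on the Hugging Face Hub. The mapping below lists commonly used variants (additional sizes are also available) and can serve as a quick lookup when choosing a model:

# Hugging Face model IDs for common DeepSeek Coder variants,
# in base and instruction-tuned versions.
DEEPSEEK_CODER_MODELS = {
    "1.3b-base": "deepseek-ai/deepseek-coder-1.3b-base",
    "1.3b-instruct": "deepseek-ai/deepseek-coder-1.3b-instruct",
    "6.7b-base": "deepseek-ai/deepseek-coder-6.7b-base",
    "6.7b-instruct": "deepseek-ai/deepseek-coder-6.7b-instruct",
    "33b-base": "deepseek-ai/deepseek-coder-33b-base",
    "33b-instruct": "deepseek-ai/deepseek-coder-33b-instruct",
}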
DeepSeek Coder supports a wide range of programming languages (more than 80), including Python, Java, C++, C#, JavaScript, TypeScript, Go, Rust, and PHP.
DeepSeek Coder has been evaluated on several coding benchmarks, including HumanEval, MBPP, and DS-1000, where it achieves state-of-the-art results among openly available code models of comparable size.
These results establish DeepSeek Coder as a leading open solution for code generation.
To start using DeepSeek Coder, follow these steps:
Clone the DeepSeek-Coder repository from GitHub and install the required dependencies using pip:
pip install -r requirements.txt
Use the following code snippet for code completion:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the base model and its tokenizer, running the model on the GPU in bfloat16.
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

# Prompt with a comment describing the desired code; the model completes it.
input_text = "#write a quick sort algorithm"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Use the following snippet for code insertion (fill-in-the-middle), where the model fills in the missing piece between a given prefix and suffix:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
input_text = """<|fim begin|>def quick_sort(arr):
if len(arr) <= 1:
return arr
pivot = arr[0]
left = []
right = []
<|fim hole|>
if arr[i] < pivot:
left.append(arr[i])
else:
right.append(arr[i])
return quick_sort(left) + [pivot] + quick_sort(right)<|fim end|>"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])
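All the fill-in-the-middle format requires is a prefix, a hole marker, and a suffix wrapped in begin/end tokens. A small helper like the one below (a hypothetical convenience function, not part of the DeepSeek Coder API) can assemble such prompts, assuming the same special tokens as the example above:

# Hypothetical helper that assembles a fill-in-the-middle prompt from a
# code prefix and suffix, using the special tokens shown above.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"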
Use the instruction-tuned chat model for conversational code generation:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
messages = [{"role": "user", "content": "write a quick sort algorithm in python."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
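For interactive use, you may want the reply to stream token by token rather than arrive all at once. Here is a minimal sketch using Hugging Face's TextStreamer, reusing the model, tokenizer, and inputs from the chat example above (the streaming setup is an addition, not part of the original example):

from transformers import TextStreamer

# Print tokens as they are generated; skip_prompt hides the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=False,
    streamer=streamer,
    eos_token_id=tokenizer.eos_token_id,
)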
Fine-tuning DeepSeek Coder allows users to adapt the model to specific tasks. Here’s how to fine-tune the model:
Install the necessary packages:
pip install -r finetune/requirements.txt
Prepare your training data as a JSON file in which each line is a JSON-serialized object with instruction and output fields (a minimal sketch of this format follows), then execute the finetune_deepseekcoder.py script with appropriate parameters.
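The snippet below writes such a data file; the sample rows are hypothetical illustrations rather than content from the DeepSeek Coder repository:

import json

# Hypothetical sample rows; each line of the training file holds one JSON object
# with "instruction" and "output" fields.
samples = [
    {
        "instruction": "Write a Python function that reverses a string.",
        "output": "def reverse_string(s):\n    return s[::-1]",
    },
]

with open("train_data.json", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")

With the data file in place, launch training with DeepSpeed: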
deepspeed finetune_deepseekcoder.py \
--model_name_or_path $MODEL_PATH \
--data_path $DATA_PATH \
--output_dir $OUTPUT_PATH \
--num_train_epochs 3 \
--model_max_length 1024 \
--per_device_train_batch_size 16 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 4 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 100 \
--save_total_limit 100 \
--learning_rate 2e-5 \
--warmup_steps 10 \
--logging_steps 1 \
--lr_scheduler_type "cosine" \
--gradient_checkpointing True \
--report_to "tensorboard" \
--deepspeed configs/ds_config_zero3.json \
--bf16 True
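Once training completes, the resulting checkpoint can be loaded like any other Hugging Face model. A minimal sketch, assuming the script saved a standard Hugging Face checkpoint to the --output_dir path used above:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Hypothetical path; replace with the --output_dir used during training.
OUTPUT_PATH = "./deepseek-coder-6.7b-finetuned"
tokenizer = AutoTokenizer.from_pretrained(OUTPUT_PATH)
model = AutoModelForCausalLM.from_pretrained(OUTPUT_PATH, torch_dtype=torch.bfloat16).cuda()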
For high-throughput inference, DeepSeek Coder can be served with vLLM. The first example below runs the base model for plain text completion:
from vllm import LLM, SamplingParams
tp_size = 4 # Tensor Parallelism
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=100)
model_name = "deepseek-ai/deepseek-coder-6.7b-base"
llm = LLM(model=model_name, trust_remote_code=True, gpu_memory_utilization=0.9, tensor_parallel_size=tp_size)
prompts = [
"If everyone in a country loves one another,",
"The research should also focus on the technologies",
"To determine if the label is correct, we need to"
]
outputs = llm.generate(prompts, sampling_params)
generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
The instruction-tuned chat model can be served with vLLM as well, applying the chat template to each conversation before generation:
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
tp_size = 4 # Tensor Parallelism
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=100)
model_name = "deepseek-ai/deepseek-coder-6.7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name, trust_remote_code=True, gpu_memory_utilization=0.9, tensor_parallel_size=tp_size)
messages_list = [
[{"role": "user", "content": "Who are you?"}],
[{"role": "user", "content": "What can you do?"}],
[{"role": "user", "content": "Explain Transformer briefly."}],
]
prompts = [tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False) for messages in messages_list]
sampling_params.stop = [tokenizer.eos_token]
outputs = llm.generate(prompts, sampling_params)
generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
DeepSeek Coder represents a significant advancement in AI-driven code generation. Its large-scale training data, range of model sizes, and strong performance on standard coding benchmarks make it a valuable asset for developers. Whether you're looking to automate code completion, fine-tune models for specific tasks, or leverage high-throughput inference, DeepSeek Coder offers a comprehensive solution. As AI continues to reshape the landscape of software development, tools like DeepSeek Coder will play a crucial role in enhancing productivity and innovation.