This article provides a step-by-step guide on how to deploy the DeepSeek-Coder-V2-Lite-Instruct WebDemo, enabling you to interact with this powerful coding LLM through a user-friendly interface. This guide is tailored for those familiar with Linux environments and aims to simplify the deployment process.
Before diving into the deployment, ensure you have the following environment set up:
It's assumed that you have already installed the necessary PyTorch (CUDA) environment. If not, please install it before proceeding.
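If you'd like to confirm the environment is ready, a quick check like the following (run in a Python shell) reports whether PyTorch can see your GPU; this is only a sanity check, not part of the deployment itself:

import torch
# Confirm that PyTorch is installed with CUDA support and a GPU is visible
print(torch.__version__)
print(torch.cuda.is_available())  # should print True on a CUDA-enabled machine
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the GPU that will host the model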
To begin, let's optimize your pip configuration and install the essential Python packages. This speeds up downloads and ensures all the required libraries are available.
# Change pypi source to accelerate library installation
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# Upgrade pip
python -m pip install --upgrade pip
pip install modelscope==1.16.1
pip install langchain==0.2.3
pip install streamlit==1.37.0
pip install transformers==4.43.2
pip install accelerate==0.32.1
In this step, we accomplish a few things. First, we configure pip to use a faster mirror for package downloads, which significantly reduces installation times, especially for large packages. We then upgrade pip to its latest version. Finally, we install the necessary Python packages, including modelscope, langchain, streamlit, transformers, and accelerate. These libraries provide the foundation for running the DeepSeek-Coder-V2-Lite-Instruct model and creating the web interface.
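If you want to confirm that the pinned versions installed correctly, a minimal check such as the following prints each library's version (each of these packages exposes a __version__ attribute):

import modelscope, langchain, streamlit, transformers, accelerate
# Print the installed version of each dependency to confirm the environment
for pkg in (modelscope, langchain, streamlit, transformers, accelerate):
    print(pkg.__name__, pkg.__version__)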
For users who prefer a pre-configured environment, an AutoDL platform image with DeepSeek-Coder-V2-Lite-Instruct is available. Use the following link to create an AutoDL instance:
https://www.codewithgpu.com/i/datawhalechina/self-llm/Deepseek-coder-v2
Next, we'll download the DeepSeek-Coder-V2-Lite-Instruct model using the modelscope library. This involves calling the snapshot_download function and specifying the model name, download path, and version.
Create a file named download.py in the /root/autodl-tmp directory and add the following code to it:
import torch
from modelscope import snapshot_download, AutoModel, AutoTokenizer
import os

# Download the model weights from ModelScope into /root/autodl-tmp;
# snapshot_download returns the local directory containing the model files
model_dir = snapshot_download('deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct', cache_dir='/root/autodl-tmp', revision='master')
Then run the script from the terminal to start the download:
python /root/autodl-tmp/download.py
The model size is approximately 40 GB, and the download process may take around 20 minutes. A successful download will be indicated by a confirmation message in the terminal.
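If you want to double-check that the weights arrived intact before moving on, a small script like the one below sums the size of the downloaded directory (the path matches the cache_dir used above; adjust it if you downloaded elsewhere):

import os

model_dir = '/root/autodl-tmp/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct'

# Walk the downloaded model directory and report its total size on disk
total_bytes = 0
for root, _, files in os.walk(model_dir):
    for name in files:
        total_bytes += os.path.getsize(os.path.join(root, name))
print(f"{total_bytes / 1024**3:.1f} GB on disk")  # should be roughly the ~40 GB mentioned above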
Now that the model is downloaded, we'll set up the Streamlit chatbot interface. This involves creating a Python script that loads the model and defines the interaction logic.
Create a file named chatBot.py in the /root/autodl-tmp directory and add the following code to it:
# Import necessary libraries
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import torch
import streamlit as st
# Sidebar title and link
with st.sidebar:
    st.markdown("## DeepSeek-Coder-V2-Lite-Instruct LLM")
    "[self-llm: Open-Source LLM Guide](https://github.com/datawhalechina/self-llm.git)"
    # Slider for maximum generation length
    max_length = st.slider("max_length", 0, 1024, 512, step=1)
# Main title and caption
st.title("💬 DeepSeek-Coder-V2-Lite-Instruct")
st.caption("🚀 A streamlit chatbot powered by Self-LLM")
# Model path
model_name_or_path = '/root/autodl-tmp/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct'
# Function to load model and tokenizer
@st.cache_resource
def get_model():
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)
    return tokenizer, model
# Load the model and tokenizer
tokenizer, model = get_model()
# Initialize messages in session state if not present
if "messages" not in st.session_state:
st.session_state["messages"] = [{"role": "assistant", "content": "有什么可以帮您的?"}]
# Display existing messages
for msg in st.session_state.messages:
    st.chat_message(msg["role"]).write(msg["content"])
# Handle user input
if prompt := st.chat_input():
    # Add user message to session state
    st.session_state.messages.append({"role": "user", "content": prompt})
    # Display user message
    st.chat_message("user").write(prompt)

    # Build the chat prompt text from the full message history
    input_text = tokenizer.apply_chat_template(st.session_state.messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([input_text], return_tensors="pt").to('cuda')

    # Generate a response, honoring the max_length slider
    generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=max_length)
    # Strip the prompt tokens so only the newly generated tokens remain
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

    # Add assistant message to session state
    st.session_state.messages.append({"role": "assistant", "content": response})
    # Display assistant message
    st.chat_message("assistant").write(response)
This code sets up a Streamlit application with a chat interface. It loads the DeepSeek-Coder-V2-Lite-Instruct model and tokenizer, manages the chat history, and generates responses based on user input. The @st.cache_resource decorator ensures that the model is loaded only once, improving performance.
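Before launching the web app, you can optionally verify that the model loads and generates outside Streamlit. The standalone sketch below reuses the same path and generation logic as chatBot.py with a hard-coded single-turn prompt:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name_or_path = '/root/autodl-tmp/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct'

# Load the tokenizer and model exactly as chatBot.py does
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)

# Build a single-turn chat prompt and generate a short reply
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([input_text], return_tensors="pt").to('cuda')
generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=128)
# Keep only the newly generated tokens, then decode them to text
generated_ids = [out[len(inp):] for inp, out in zip(model_inputs.input_ids, generated_ids)]
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])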
With the code prepared, you can now launch the Streamlit web interface. This will make the DeepSeek-Coder-V2-Lite-Instruct model accessible through a web browser.
streamlit run /root/autodl-tmp/chatBot.py --server.address 127.0.0.1 --server.port 6006 --server.enableCORS false
You should now see the DeepSeek-Coder-V2-Lite-Instruct chatbot interface in your browser. You can start interacting with the model by typing in the chatbox. Refer to official Streamlit documentation for advanced customization.
This guide has walked you through deploying the DeepSeek-Coder-V2-Lite-Instruct WebDemo, giving you a local web interface to this excellent coding LLM that you can now adapt to your specific use case.