The DeepSeek API offers powerful tools for text generation, including the FIM (Fill-In-the-Middle) Completion API. This article covers the technical details of the FIM completion endpoint — its request parameters, response format, and how to use it effectively — and assumes basic familiarity with HTTP APIs and JSON.
FIM completion is an approach to text generation in which the model fills in a missing section within a given text. Instead of generating text from scratch or continuing a prompt from the end, FIM completion inserts content contextually into an existing piece of writing. This is especially useful for tasks such as code infilling (completing the body of a function when the surrounding code is known), editing or expanding documents, and filling gaps in templates.
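Conceptually, the model receives a prefix (the prompt) and a suffix, and generates only the text in between; the final document is the concatenation of all three. A minimal illustration in plain Python, with no API call (the example strings are invented for illustration):

```python
def assemble_fim_result(prompt: str, completion: str, suffix: str) -> str:
    """Reconstruct the full text from a FIM request and its completion.

    The API generates only `completion`; the caller supplies `prompt`
    (the text before the gap) and `suffix` (the text after it).
    """
    return prompt + completion + suffix

# Hypothetical example: infilling the body of a function.
prefix = "def greet(name):\n"
suffix = "\n    return message"
completion = '    message = f"Hello, {name}!"'  # what the model would fill in

print(assemble_fim_result(prefix, completion, suffix))
```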
The DeepSeek API's FIM Completion feature is currently in Beta. To access it, you need to set base_url="https://api.deepseek.com/beta". Be sure to check the DeepSeek API documentation for the most up-to-date information.
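As a small sketch (assuming the requests library, which the full example later in this article also uses), you can centralize the beta base URL and the auth header in a reusable session; YOUR_API_KEY is a placeholder:

```python
import requests

# The FIM endpoint lives under the beta base URL.
BETA_BASE_URL = "https://api.deepseek.com/beta"

def make_session(api_key: str) -> requests.Session:
    """Return a Session pre-configured with the DeepSeek auth header."""
    session = requests.Session()
    session.headers.update({
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    })
    return session

session = make_session("YOUR_API_KEY")
endpoint = f"{BETA_BASE_URL}/completions"
```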
The /completions Endpoint

To use the FIM Completion API, you'll need to send a POST request to the /completions endpoint. The request body should be in application/json format and contain several key parameters. Let's break down these parameters:
Required Parameters:

- model (string): The ID of the model to use. Currently, the only possible value is deepseek-chat.
- prompt (string): The initial text that comes before the gap you want the model to fill. The default value is "Once upon a time,".

Optional Parameters:

- echo (boolean, nullable): If set to true, the API will echo back the prompt in addition to the completion.
- frequency_penalty (number, nullable): A value between -2.0 and 2.0. Positive values penalize new tokens based on their frequency in the existing text, reducing repetition. Default is 0.
- logprobs (integer, nullable): Include log probabilities on the logprobs most likely output tokens. Maximum value is 20.
- max_tokens (integer, nullable): The maximum number of tokens the model can generate for the completion.
- presence_penalty (number, nullable): A value between -2.0 and 2.0. Positive values penalize new tokens based on their presence in the text so far, increasing the likelihood of the model discussing new topics. Default is 0.
- stop (string or string[], nullable): Up to 16 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. This can be useful to prevent your generations from "running away".
- stream (boolean, nullable): If set to true, the API will stream back partial progress as server-sent events.
- stream_options (object, nullable): Options for streaming responses. Only relevant when stream is true. Contains one field:
  - include_usage (boolean): If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value.
- suffix (string, nullable): The text that comes after the insertion point. This is crucial for FIM, helping the model understand the context surrounding the missing section.
- temperature (number, nullable): A value between 0 and 2 that controls the randomness of the output. Higher values (e.g., 0.8) produce more random output, while lower values (e.g., 0.2) produce more focused output. Default is 1.
- top_p (number, nullable): An alternative to temperature called nucleus sampling, in which the model considers only the tokens comprising the top_p probability mass. A value of 0.1 means only tokens in the top 10% of probability mass are considered. Default is 1. It's generally recommended to adjust either temperature or top_p, but not both.

A successful request (200 OK) will return a JSON response containing the generated completion. The response includes:
- id (string): A unique identifier for the completion.
- choices (object[]): An array of completion choices. Each choice includes:
  - finish_reason (string): The reason the model stopped generating tokens (e.g., "stop", "length", "content_filter", "insufficient_system_resource").
  - index (integer): The index of the choice.
  - logprobs (object, nullable): Log probability information (token offsets, log probabilities, tokens, and top log probabilities).
  - text (string): The generated text.
- created (integer): The Unix timestamp of when the completion was created.
- model (string): The model used for the completion.
- object (string): The object type, which is always "text_completion".
- usage (object): Usage statistics for the completion request, including:
  - completion_tokens (integer): Number of tokens in the generated completion.
  - prompt_tokens (integer): Number of tokens in the prompt.
  - total_tokens (integer): Total number of tokens used in the request.

Here is a complete example request in Python:

```python
import json
import requests

url = "https://api.deepseek.com/beta/completions"  # Ensure you're using the beta base URL

headers = {
    "Content-Type": "application/json",
    # Replace with your actual API key
    "Authorization": "Bearer YOUR_API_KEY",
}

data = {
    "model": "deepseek-chat",
    # The prompt is the text *before* the gap; the suffix is the text *after* it.
    # The model generates what goes in between.
    "prompt": "This is the start of a sentence. ",
    "suffix": " And this is the end.",
    "max_tokens": 50,
}

response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    print(json.dumps(response.json(), indent=2))
else:
    print(f"Error: {response.status_code} - {response.text}")
```
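For illustration, here is how you might extract the generated text and token counts from a response with the shape described above. The sample payload below is hypothetical, not real API output:

```python
import json

# A hypothetical response payload matching the documented schema.
sample = json.loads("""
{
  "id": "cmpl-123",
  "object": "text_completion",
  "created": 1718000000,
  "model": "deepseek-chat",
  "choices": [
    {"index": 0, "text": "It continues here.", "finish_reason": "stop", "logprobs": null}
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 5, "total_tokens": 17}
}
""")

# The generated text is what you splice between your prompt and suffix.
completion_text = sample["choices"][0]["text"]
total_tokens = sample["usage"]["total_tokens"]

print(completion_text)
print(total_tokens)
```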
Important Considerations for the code example:

- Replace YOUR_API_KEY with your actual DeepSeek API key, which you can obtain from the DeepSeek Platform.
- The suffix is crucial. The model needs to know what comes after the part it's filling in to generate contextually relevant text.
- You don't need a placeholder token in the prompt: the gap the model fills sits between the end of the prompt and the start of the suffix. If you use a marker such as <FILL_ME> while drafting, remove it before sending the request, or it will be treated as literal context.
- The more context you provide in the prompt and suffix, the better the completion will be.
- temperature and top_p: Adjust these parameters to control the creativity and focus of the generated text.
- stop sequences: Define stop sequences to prevent the model from generating overly long or irrelevant text.

The DeepSeek API's FIM Completion feature opens up exciting possibilities for text generation. By understanding the API parameters and response format, you can leverage this powerful tool to create innovative applications in various domains. As the feature is in Beta, keep an eye on the DeepSeek API Docs for updates and improvements. And don't forget to consult the API Guides for more examples of how to implement FIM completion with DeepSeek!