The world of AI is constantly evolving, and the latest contender to enter the ring is DeepSeek-R1. Released on January 20, 2025, DeepSeek-R1 is making waves with its claimed performance that rivals OpenAI's o1
model. But what exactly is DeepSeek-R1, and what makes it so noteworthy?
DeepSeek, the company behind this new model, has officially released DeepSeek-R1 and, importantly, has open-sourced its model weights. This move is significant for the AI community, fostering collaboration and innovation.
DeepSeek-R1 is released under the MIT License, a permissive license that grants users broad freedoms: the model weights can be used, modified, distributed, and incorporated into commercial products, with attribution as the main requirement.
This commitment to open source makes DeepSeek-R1 a valuable resource for researchers, developers, and businesses alike. By providing access to the model weights, DeepSeek is encouraging the community to explore, experiment, and build upon its foundation.
DeepSeek-R1 is not just a model; it's also accessible through an API. This allows developers to integrate its capabilities into their applications. A key feature offered through the API is "chain of thought" output, accessible by setting model='deepseek-reasoner'
in the API call. "Chain of thought" reasoning allows the model to break down complex tasks into smaller, more manageable steps, leading to more accurate and reliable results, mirroring the human thought process.
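As a minimal sketch of what such an API call might look like, the snippet below builds an OpenAI-style chat-completions payload with model set to "deepseek-reasoner". The endpoint URL and the "reasoning_content" response field shown in the comments are assumptions about the API's shape rather than guarantees; consult DeepSeek's API documentation for the authoritative details.

```python
import json

def build_r1_request(question: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload for DeepSeek-R1.

    Setting model to "deepseek-reasoner" requests chain-of-thought output.
    """
    return {
        "model": "deepseek-reasoner",  # selects DeepSeek-R1 with reasoning
        "messages": [{"role": "user", "content": question}],
    }

payload = build_r1_request("How many prime numbers are less than 20?")
body = json.dumps(payload)  # serialize for an HTTP POST with your API key

# The response is assumed to expose the chain of thought separately from the
# final answer, along these lines (field names are an assumption):
# choice = response["choices"][0]["message"]
# print(choice["reasoning_content"])  # step-by-step reasoning
# print(choice["content"])            # final answer
```

Keeping the reasoning in a separate field lets an application log or display the model's intermediate steps while showing users only the final answer.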
DeepSeek claims that DeepSeek-R1's performance is on par with OpenAI's o1
model, particularly in areas like mathematics, code generation, and natural language reasoning. This claim is supported by benchmark data released alongside the model. The improvements are attributed to the extensive use of reinforcement learning during the post-training phase, enabling significant gains in reasoning abilities even with limited labeled data.
In addition to the DeepSeek-R1 model, DeepSeek has also released six smaller models derived from DeepSeek-R1 through a process called "distillation." Impressively, the 32B and 70B parameter models are reported to outperform OpenAI's o1-mini
model in various capabilities. This is notable because it suggests that DeepSeek-R1's architecture and training methodology yield real efficiency gains, allowing much smaller models to reach near state-of-the-art performance.
The release also included important updates to DeepSeek's licensing and user agreements.
DeepSeek-R1 is also directly accessible through the DeepSeek official website or the DeepSeek App. Users can enable "deep thinking mode" to leverage the latest DeepSeek-R1 model for various reasoning tasks.
The DeepSeek-R1 API has a specific pricing structure designed to balance cost and accessibility. For every million input tokens, the cost is 1 RMB (approximately $0.14 USD) if the request is cached, and 4 RMB (approximately $0.56 USD) if it is not cached. Output tokens are priced at 16 RMB (approximately $2.25 USD) per million.
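The pricing above translates into a simple per-request cost calculation. The helper below is an illustrative sketch based only on the rates quoted in this article (1 RMB per million cached input tokens, 4 RMB per million uncached, 16 RMB per million output tokens); check the official pricing page for current figures.

```python
def estimate_cost_rmb(input_tokens: int, output_tokens: int,
                      cached: bool = False) -> float:
    """Estimate DeepSeek-R1 API cost in RMB for one request."""
    input_rate = 1.0 if cached else 4.0  # RMB per million input tokens
    output_rate = 16.0                   # RMB per million output tokens
    return (input_tokens * input_rate +
            output_tokens * output_rate) / 1_000_000

# Example: 50,000 uncached input tokens plus 10,000 output tokens
cost = estimate_cost_rmb(50_000, 10_000)
print(f"{cost:.2f} RMB")  # 0.36 RMB
```

Note how output tokens dominate the bill: at 16 RMB per million they cost four times as much as uncached input, so long chain-of-thought responses are the main cost driver.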
To delve deeper into DeepSeek-R1, the official DeepSeek website and the open-sourced model weights are good starting points.
DeepSeek-R1 represents a significant advancement in the field of AI. Its open-source nature, combined with its competitive performance and accessible API, makes it a compelling option for developers and researchers. Whether it can truly unseat established players like OpenAI remains to be seen, but DeepSeek-R1 has undoubtedly thrown down the gauntlet.