Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., better known as DeepSeek, is a rising star in the artificial intelligence (AI) arena. This Chinese company is making waves by developing powerful, open-source large language models (LLMs) that rival those of industry giants like OpenAI and Meta, but at a fraction of the cost.
Founded in July 2023, DeepSeek is backed by the Chinese hedge fund High-Flyer. The seeds of DeepSeek were sown back in 2016 when High-Flyer co-founder Liang Wenfeng, an AI enthusiast, began using deep learning models for stock trading. This eventually led to the creation of an artificial general intelligence (AGI) lab in 2023, which later evolved into the independent company we know as DeepSeek.
DeepSeek has gained significant attention for its ability to train sophisticated LLMs at a much lower cost than its competitors. For example, DeepSeek claims its R1 model was trained for just $6 million, a fraction of the $100 million reported for OpenAI's GPT-4.
The lower training cost is often attributed to US export controls on AI chips, which restricted Chinese firms' access to advanced Nvidia hardware and pushed them toward cheaper, more efficient approaches. Cutting costs while maintaining model performance sent "shockwaves" through the AI industry and the stock market. It challenged the dominance of established players and contributed to a sharp sell-off in AI-related stocks, with Nvidia alone losing about $600 billion in market value, the largest single-day loss for any company in US stock market history.
It's important to note that DeepSeek's models are "open weight": the trained model weights are published, but not necessarily the training data or the full training pipeline, so they offer less freedom for inspection and modification than true open-source software. However, this approach still allows researchers and developers to access and utilize the models, fostering innovation and collaboration.
DeepSeek has released a variety of LLMs, each with its own strengths and applications:
DeepSeek relies on a robust training framework and infrastructure, including:
Currently, DeepSeek is focused on research and development rather than immediate commercialization. By avoiding consumer-facing technology, the company can navigate China's AI regulations more easily. DeepSeek prioritizes hiring talent based on technical ability and breadth of knowledge rather than extensive work experience. This approach allows it to bring fresh perspectives to the field of machine learning.
Like many AI companies operating in China, DeepSeek faces scrutiny regarding content moderation and potential bias. Some reports indicate that DeepSeek models are subject to content restrictions in accordance with local regulations, particularly concerning sensitive topics like the Tiananmen Square massacre and the political status of Taiwan. While some users have found ways to bypass this censorship, concerns remain about potential biases in the models' responses. As a result, several governments, including those of South Korea, Australia, and Taiwan, have banned DeepSeek applications on government-issued devices.
DeepSeek is quickly establishing itself as a major player in the open-source AI world. Its ability to develop high-performing LLMs at a low cost has the potential to democratize access to AI technology and drive innovation across various industries. As DeepSeek continues to evolve and release new models, it will be fascinating to watch its impact on the global AI landscape.