Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., commonly known as DeepSeek, is a rising force in the artificial intelligence landscape. Founded in July 2023 and based in Hangzhou, Zhejiang, China, DeepSeek focuses on developing open-source large language models (LLMs). Backed by the Chinese hedge fund High-Flyer, DeepSeek has quickly gained recognition for its cost-effective AI solutions that rival those of industry giants like OpenAI and Meta.
The story of DeepSeek begins with High-Flyer, co-founded in February 2016 by Liang Wenfeng, an AI enthusiast with a background in trading. High-Flyer initially focused on stock trading, leveraging GPU-dependent deep learning models. By 2019, the company became a hedge fund dedicated to AI trading algorithms, heavily relying on Nvidia chips.
Recognizing the potential of AI beyond finance, Liang established an artificial general intelligence (AGI) lab in April 2023, separate from High-Flyer's core business. This lab officially became DeepSeek in July 2023, with High-Flyer as its primary investor. Despite reluctance from venture capital firms, DeepSeek forged ahead, driven by a vision to create accessible and powerful AI models.
DeepSeek rapidly introduced a series of models:
This rapid development cycle highlights DeepSeek's commitment to innovation and its ability to quickly adapt to the evolving AI landscape.
DeepSeek operates from Hangzhou, Zhejiang, with funding and ownership primarily held by High-Flyer. Liang Wenfeng serves as the CEO of both companies, maintaining a significant stake in DeepSeek through shell corporations.
The company's strategy centers on research and development. Unlike some competitors, DeepSeek has not publicly emphasized near-term paths to commercialization. This research-first posture may allow it to avoid stringent AI regulations within China. DeepSeek's hiring practices prioritize technical ability and breadth of knowledge, drawing talent from top Chinese universities and from fields outside computer science.
DeepSeek utilizes powerful computing clusters, including Fire-Flyer and Fire-Flyer 2, to train its models. Fire-Flyer 2 features a co-designed software and hardware architecture:
The software side includes:
These technologies enable efficient training and optimization of DeepSeek's large language models.
DeepSeek offers a diverse range of LLMs, each with unique characteristics and capabilities. The company's models are generally "open weight," which allows more access than closed proprietary systems but less freedom than truly open-source systems. Here's an overview:
DeepSeek's emergence has significantly impacted the AI industry, challenging established players and driving down costs. Its ability to achieve performance comparable to that of larger models at significantly lower training cost has been described as "upending AI."
Domestically, DeepSeek is dubbed the "Pinduoduo of AI," sparking a price war among Chinese tech giants. Despite its low prices, DeepSeek has maintained profitability, unlike some of its competitors.
Overseas, DeepSeek's development amid US sanctions highlights its resilience and ability to innovate despite restrictions on access to advanced chips.
DeepSeek faces scrutiny regarding content moderation, with reports indicating adherence to local regulations limiting responses on sensitive topics. Uncensored models have shown bias towards Chinese government viewpoints.
DeepSeek has been banned on government devices in South Korea, Australia, and Taiwan.
DeepSeek's rapid rise as a developer of open-source large language models marks an important new stage in the evolution and accessibility of AI. Fueled by an ethos of innovation and technical excellence, DeepSeek is well-positioned to continue pushing the boundaries of what's possible in artificial intelligence.