The landscape of artificial intelligence is constantly evolving, with new innovations emerging from across the globe. Recently, the AI community has been buzzing about DeepSeek R1, a new open-source reasoning model developed by the Chinese AI startup DeepSeek. This model has demonstrated performance that rivals or surpasses OpenAI's ChatGPT o1 on several key benchmarks, all while operating at a fraction of the cost. What makes DeepSeek's achievement even more remarkable is that they accomplished this despite facing increasing US export controls on cutting-edge chips.
Instead of hindering China's AI development, the US sanctions appear to be driving startups like DeepSeek to innovate in ways that prioritize efficiency, resource-pooling, and collaboration.
To create R1, DeepSeek reworked its training process to reduce the strain on its GPUs. According to Zihan Wang, a former DeepSeek employee and current PhD student in computer science at Northwestern University, the company trained on a variety of Nvidia GPUs released specifically for the Chinese market, whose performance is capped at roughly half that of Nvidia's top products.
DeepSeek R1 has been lauded by researchers for its ability to handle complex reasoning tasks, especially in mathematics and coding. Like ChatGPT o1, the model uses a "chain of thought" approach, breaking down problems step by step to arrive at solutions. Dimitris Papailiopoulos, principal researcher at Microsoft’s AI Frontiers research lab, noted that DeepSeek focused on accurate answers rather than detailing every logical step, which significantly reduced computing time while maintaining a high level of effectiveness.
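To make the chain-of-thought idea concrete, here is a minimal sketch of querying an R1-style reasoning model through an OpenAI-compatible chat endpoint. The base URL, the model name "deepseek-reasoner", and the placeholder API key are assumptions made for illustration, not details confirmed in this article.

```python
# Minimal sketch: asking a reasoning model to work through a problem step by step.
# The endpoint and model name below are assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",               # hypothetical placeholder
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed identifier for the R1 reasoning model
    messages=[
        {
            "role": "user",
            "content": "A train travels 120 km in 1.5 hours. "
                       "What is its average speed in km/h?",
        }
    ],
)

# The reply contains the final answer; depending on the API version,
# the intermediate reasoning may be exposed as a separate field.
print(response.choices[0].message.content)
```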
Here are some key features of DeepSeek R1:

- Open-source availability, allowing researchers and developers to inspect and build on the model.
- Strong performance on complex reasoning tasks, particularly mathematics and coding, rivaling OpenAI's o1 on several benchmarks.
- A "chain of thought" approach that breaks problems down step by step.
- Training and operating costs at a fraction of those of comparable proprietary models.
- A family of six smaller distilled versions that can run locally on laptops.
DeepSeek has also released six smaller versions of R1 that can run locally on laptops. The company claims that one of them even outperforms OpenAI’s o1-mini on certain benchmarks. Aravind Srinivas, CEO of Perplexity, tweeted that DeepSeek has largely replicated o1-mini and has open-sourced it.
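As an illustration of what running one of these distilled variants locally might look like, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name below is an assumption; substitute whichever distilled model DeepSeek actually publishes.

```python
# Minimal sketch: loading a small distilled reasoning model on a laptop.
# The checkpoint name is an assumption, not confirmed by this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is the sum of the first 100 positive integers?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a step-by-step answer; small distilled models can run on CPU,
# though a GPU will be considerably faster.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```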
Based in Hangzhou, China, DeepSeek was founded in July 2023 by Liang Wenfeng, an alumnus of Zhejiang University with a background in information and electronic engineering. The company was incubated by High-Flyer, a hedge fund that Liang founded in 2015. Like Sam Altman of OpenAI, Liang aims to build artificial general intelligence (AGI), a form of AI that can match or even beat humans on a range of tasks.
Liang's decision to venture into AI was directly related to US export controls on advanced semiconductors. Before the anticipated sanctions, Liang acquired a substantial stockpile of Nvidia A100 chips, a type now banned from export to China. The Chinese media outlet 36Kr estimates that the company has over 10,000 units in stock, but Dylan Patel, founder of the AI research consultancy SemiAnalysis, estimates that it has at least 50,000. Liang recognized the potential of this stockpile for AI training, which led him to establish DeepSeek.
The Chinese AI space is dominated by tech giants like Alibaba and ByteDance, as well as startups with deep-pocketed investors. This makes it challenging for small or medium-sized enterprises to compete. DeepSeek, which has no plans to actively raise funds, is a rare exception.
Liang Wenfeng noted in an interview with the Chinese media outlet 36Kr in July 2024 that Chinese companies face an additional challenge: their AI engineering techniques tend to be less efficient. He stated that Chinese companies consume twice the computing power to achieve the same results. "Our goal is to continuously close these gaps," he said.
DeepSeek found ways to reduce memory usage and speed up computation without significantly sacrificing accuracy. According to Zihan Wang, the team embraced hardware challenges as opportunities for innovation. Liang himself remains deeply involved in DeepSeek's research process, running experiments alongside his team.
Chinese companies are increasingly embracing open-source principles. Alibaba Cloud has released over 100 new open-source AI models, supporting 29 languages and catering to various applications, including coding and mathematics. Startups like Minimax and 01.AI have also open-sourced their models.
According to a white paper released last year by the China Academy of Information and Communications Technology, the number of AI large language models worldwide has reached 1,328, with 36% originating in China, making the country the second-largest contributor of large language models after the United States. Thomas Qitong Cao, an assistant professor of technology policy at Tufts University, said that young Chinese researchers identify strongly with open-source culture because they benefit so much from it.
Matt Sheehan, an AI researcher at the Carnegie Endowment for International Peace, suggests that US export controls have forced Chinese companies to be far more efficient with their limited computing resources. The rapid evolution of AI demands agility from Chinese firms to survive. Recently, Alibaba Cloud partnered with the Beijing-based startup 01.AI to merge research teams and establish an "industrial large model laboratory."
DeepSeek's journey exemplifies how restrictions can spur innovation and efficiency. By focusing on resource optimization and embracing open-source collaboration, Chinese AI companies are making significant strides in the field. As the AI landscape continues to evolve, it will be interesting to see how these trends shape the future of AI development and deployment worldwide.