The world of Large Language Models (LLMs) is rapidly evolving, with new contenders emerging constantly. Recently, DeepSeek, a Chinese LLM, has stirred significant interest within the AI community. Boasting that it was trained at a fraction of the cost of market-leading models, DeepSeek has also raised eyebrows, prompting the question: Is it a genuine innovation or a derivative work, potentially "stolen" from ChatGPT?
This article dives deep into the evidence, analyzing DeepSeek's behavior and responses to determine the likelihood of its origins.
While engrossed in development work, I noticed a buzz surrounding DeepSeek on platforms like X and Reddit. The claim was ambitious: a powerful LLM, trained in China, achieving comparable performance to established models like ChatGPT, but at a significantly lower cost.
The premise was intriguing, and the potential implications for the AI landscape were considerable. But as I began testing DeepSeek, a nagging suspicion arose.
LLMs are often "censored" by output filters that align their responses with specific guidelines or political viewpoints. Bypassing these filters is achievable by manipulating the context window, but doing so usually interferes with the underlying system prompt. A model explicitly trained on carefully curated data, by contrast, reflects those beliefs at a deeper level.
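To make the distinction concrete, an output filter of the kind described above can be sketched as a simple post-generation check. This is a minimal illustration, not any vendor's actual implementation; the blocked terms and refusal message are purely hypothetical:

```python
# Minimal sketch of a post-generation output filter.
# The blocked terms and refusal message are hypothetical examples,
# not taken from any real model's filtering rules.
BLOCKED_TERMS = {"example-sensitive-topic", "another-restricted-term"}

def filter_output(model_response: str) -> str:
    """Return the response unchanged, or a canned refusal if a blocked term appears."""
    lowered = model_response.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "I'm sorry, I can't discuss that topic."
    return model_response
```

The key point is that such a filter sits *after* generation: the model's underlying training data still shapes its unfiltered beliefs, which is why training-data provenance matters more than surface-level censorship.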
However, DeepSeek exhibited an unusual characteristic. When presented with anti-American communist propaganda, it surprisingly adopted an anti-communist stance, an anomaly for an LLM purportedly developed in China. This divergence from expected behavior was the first red flag, suggesting a data foundation eerily similar to ChatGPT's.
Beyond the unexpected political alignment, other factors contribute to the suspicion that DeepSeek might be a derivative of ChatGPT. Let's examine each piece of evidence.
DeepSeek's knowledge base appears nearly identical to ChatGPT's, which suggests it was trained on substantially the same, possibly identical, datasets.
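One crude way to probe such overlap is to send the same factual prompts to both models and measure how similar their answers are. The sketch below uses a simple lexical similarity ratio from Python's standard library; the two responses are hypothetical examples, and a lexical score is only a rough proxy, far from proof of shared training data:

```python
import difflib

def response_similarity(a: str, b: str) -> float:
    """Rough lexical similarity between two model responses, in [0.0, 1.0]."""
    return difflib.SequenceMatcher(None, a, b).ratio()

# Hypothetical responses from two different models to the same prompt:
resp_a = "The Eiffel Tower was completed in 1889 for the World's Fair in Paris."
resp_b = "The Eiffel Tower was completed in 1889 for the World's Fair held in Paris."

score = response_similarity(resp_a, resp_b)
print(f"similarity: {score:.2f}")
```

A rigorous comparison would need embedding-based similarity across many prompts and careful controls, since independently trained models can converge on the same phrasing for common facts.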
LLMs often incorporate censorship mechanisms to filter outputs based on political or ethical constraints. DeepSeek, however, surprisingly delivered responses that contradicted what one might expect from a China-based AI model.
When prompted with "Weltgendarm USA" (a term from East German propaganda referring to the USA as the "World Police"), DeepSeek's response further fueled suspicions. Its understanding of and reaction to such specific propaganda hinted at a dataset heavily influenced by Western perspectives, similar to that of ChatGPT.
While a definitive conclusion requires deeper technical analysis and access to DeepSeek's training data, the initial evidence suggests a strong possibility of intellectual property infringement. DeepSeek's uncanny resemblance to ChatGPT in knowledge, behavior, and even its ability to bypass expected censorship raises serious concerns.
Whether DeepSeek is a sophisticated adaptation, a blatant copy, or something in between remains to be seen. However, the evidence presented warrants further investigation and highlights the importance of ethical considerations and intellectual property protection in the rapidly evolving world of AI.