Why DeepSeek R1 Is a Blow to America

On January 21, 2025, Trump announced the Stargate Project. Five hundred billion dollars was to be invested over four years to build gigantic data centers aimed at developing artificial general intelligence. Such an AI, it was promised, would cure cancer and bring abundance. As a share of the USA's GDP, the investment is comparable to the Manhattan Project that built the nuclear bomb.


Curiously, the data centers were to be owned by OpenAI and its investors (Microsoft, Oracle, and SoftBank). These private companies would put up the money and recoup their investments by selling model-inference services to individuals and companies.

These companies draw confidence from the performance of OpenAI's O series of models. In September 2024, OpenAI launched the O1 model, which can solve PhD-level problems in mathematics, reasoning, and coding.


The O1 model was a massive improvement over the then state-of-the-art models. On benchmarks, it performed better than a PhD holder in mathematics, reasoning, coding, and science. But more important is how it works: OpenAI trained the model to think longer before giving a reply, and each iteration of the model learned from the outputs of its previous version. In this context, reinforcement learning refers to a training procedure in which the model iteratively learns from the outputs of its earlier versions.

Reinforcement learning is also how Google DeepMind created AlphaGo. The technique lets a model learn from higher-quality data than exists in the real world: by playing against itself, AlphaGo iteratively improved until it surpassed the best human players at the game of Go.

Then, in December 2024, OpenAI released benchmark results for O3, the next model in the series. While O1 was comparable to an average PhD holder, O3's benchmarks placed it close to the best humans. In competitive programming, for example, its score was roughly equivalent to that of the 300th-ranked programmer in the world. O3 is the result of applying the same reinforcement learning techniques to improve on previous models.





Currently, only the O1 and O3-mini models are accessible. OpenAI charges $60 per million output tokens for O1, substantially more than other large language models cost.



But just a few days ago, DeepSeek, a Chinese company, released an open-source reasoning model called DeepSeek R1. It comes very close to O1 in performance but costs roughly 30 times less, and the model weights have been released for free. Early tests have found its performance to be as good as the claimed benchmarks.
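To put the price gap in numbers, here is a quick sketch. The $60-per-million output-token rate for O1 is OpenAI's published price; the R1 figure used below ($2.19 per million output tokens) is an assumption based on DeepSeek's listed API price at launch, and both are snapshots that may change.

```python
# Rough cost comparison for generating output tokens.
# O1 price: OpenAI's published rate. R1 price: assumed from DeepSeek's
# launch pricing -- an illustrative figure, not a quote.

O1_PRICE_PER_M = 60.00  # USD per 1M output tokens (OpenAI O1)
R1_PRICE_PER_M = 2.19   # USD per 1M output tokens (DeepSeek R1, assumed)

def output_cost(tokens: int, price_per_million: float) -> float:
    """USD cost of generating `tokens` output tokens at a given rate."""
    return tokens / 1_000_000 * price_per_million

# A long reasoning answer of about 10,000 tokens:
o1_cost = output_cost(10_000, O1_PRICE_PER_M)  # $0.60
r1_cost = output_cost(10_000, R1_PRICE_PER_M)  # about $0.02
ratio = O1_PRICE_PER_M / R1_PRICE_PER_M        # about 27x
```

At these rates the "roughly 30 times cheaper" claim works out to about a 27x ratio; the exact multiple depends on which prices and token mix you compare.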


They have also released smaller models that outperform all non-reasoning models in mathematics and programming. Alongside these models, DeepSeek published a technical report detailing how it trained them at roughly 50 times lower cost than existing state-of-the-art models. That breakthrough is what lets DeepSeek price its model about 30 times below O1.

This development has thrown Silicon Valley into disarray. If one of the best large language models is freely available to users, how can anyone justify raising $500 billion for the creation of AGI?

All of this is remarkable, especially since the USA banned the sale of top-end GPUs to China in October 2022. The DeepSeek team claims to have trained its model on 2,000 H800 GPUs, chips that were intentionally limited ("nerfed") so they could legally be sold in mainland China. The team also reports writing low-level, assembly-like GPU code to work around the bandwidth restrictions on these chips.

Beyond these GPU-level hacks, the DeepSeek team describes several technical innovations in its report, though most of the claims remain to be verified. The most important claim is the success of the AlphaZero technique in training large language models.

In 2016, Google DeepMind used reinforcement learning to beat one of the top Go players in the world. Its AlphaGo program was first trained on human Go games and then further refined on self-play games. Later, the team achieved a breakthrough: a model trained solely through reinforcement learning on self-play games, without any human game data, won 100 out of 100 games against the previous version, which had been trained on a mix of human and self-play data.
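The loop behind that breakthrough can be sketched in a few lines. The toy below replaces the deep network and Monte Carlo tree search with a single "skill" number and simple hill climbing, so it illustrates only the structure of self-play training (play your previous self, promote the winner), not AlphaZero's actual algorithm; every constant and threshold here is made up.

```python
import random

random.seed(0)  # make the toy run repeatable

def play(skill_a: float, skill_b: float) -> int:
    """One noisy game between two policies; returns 1 if A wins, else 0."""
    a = skill_a + random.gauss(0, 1)
    b = skill_b + random.gauss(0, 1)
    return 1 if a > b else 0

best_skill = 0.0  # the current best policy, starting from scratch
for generation in range(50):
    # Stand-in for a training step on self-play games: perturb the
    # current best policy to get a candidate.
    candidate = best_skill + random.gauss(0, 0.5)
    # Evaluation match against the previous version, echoing how
    # AlphaGo Zero gated new networks against their predecessors.
    wins = sum(play(candidate, best_skill) for _ in range(200))
    if wins > 110:  # promote only a clearly stronger candidate
        best_skill = candidate
```

Run over many generations, the promoted "skill" tends to ratchet upward without any external data, which is the essence of the self-play result described above.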




DeepSeek claims to have initially trained its model purely through reinforcement learning, and released that version as DeepSeek R1-Zero. The AlphaZero technique did not work perfectly, however: R1-Zero was prone to repetition and hallucination. Consequently, they fine-tuned R1-Zero on a mix of human and synthetic data, producing the DeepSeek R1 model. R1 trails just behind OpenAI's O1 in mathematics and coding but beats it in creative writing. This, along with its low cost, has made DeepSeek R1 an overnight sensation worldwide.

Even some of the best coders in the world are now using DeepSeek R1 for their work; Georgi, for example, has incorporated it into his projects.


On January 31, 2025, OpenAI released the O3-mini model to the public. O3-mini at medium reasoning effort is available to free and Plus-tier users only under usage limits, and even then it is merely comparable to DeepSeek R1. Its coding ability, for example, can be seen on the code-editing leaderboard for large language models.



OpenAI may vault ahead again, but what DeepSeek R1 has demonstrated is that open-source large language models are here to stay. DeepSeek R1 could pave the way for further reinforcement learning innovations and rapid progress in the open-source LLM arena.



Having outsourced much of its manufacturing to China, America has long depended on its dominance of the technology sector. But bit by bit, China is making inroads everywhere. Consider electric vehicles: in just four years, China has become the world's leading vehicle exporter, primarily by leveraging electric-vehicle technology copied from Tesla.





This is causing a sharp decline in Tesla's financials. And that is mild compared to the situation in Europe.


China is the largest global market for, and producer of, nearly all goods. To gain access to Chinese markets and lower its costs, Tesla set up factories in China, and Chinese companies quickly copied its technology, a pattern repeated in every industry. It is even rumored that DeepSeek trained on billions of dollars' worth of smuggled GPUs. Frankly, it is hard to imagine GPUs being smuggled into China without the help of the Chinese government.

The USSR was once criticized for a lack of incentives that stalled innovation and progress. China, in contrast, has combined elements of both the planned and market systems: strong government support for its companies alongside a private sector that pays competitive salaries. Chinese students also make up a major share of science and technology students in America, and indeed a significant number of OpenAI employees are Chinese.

America's influence was built on its leadership in technology and the economy. In the past, being allied with America meant gaining access to the latest technology, foreign investment, and cheaper goods. Today, however, China is the largest trading partner of nearly every country in the world, except those in North America.



If Chinese companies develop AGI first, they will rapidly catch up in all remaining technological fields, leaving no competitive advantage for America and its allies.



