DeepSeek — AI’s Linus Torvalds Moment
The third decade of the twenty-first century has seen two epochal breakthroughs in AI. The first was the release of ChatGPT, a state-of-the-art large language model (LLM) by OpenAI. The second is the release of the R1 model by DeepSeek. The former ushered in an era of proprietary LLMs, hindering the democratization of AI; the latter, hopefully, will reverse this monopolistic trend and compel the industry to embrace an open-source approach, much as Linus Torvalds did by introducing the Linux kernel in 1991.
The Proprietary Era of AI
ChatGPT’s release in late November 2022 was regarded as an artificial intelligence (AI) breakthrough that turned the technological landscape upside down. Overnight, it became the talk of the town, and within two months of its release it had reached 100 million active users. This unprecedented growth propelled it to the status of the fastest-growing consumer application in history, and it was widely hailed as a technological marvel.
Owing to ChatGPT’s unprecedented success, tech giants like Google, Meta, and others, long dominant in the IT field, began to see it as a formidable challenger. On the other hand, venture capitalists — who are always on the lookout for the next cash cow — recognized AI as a highly profitable sector with substantial return on investment (ROI) potential.
Consequently, Big Tech and Silicon Valley jumped on the AI bandwagon, pouring millions of dollars into startups and cutting-edge AI initiatives, and the commercialization of AI began. As a result, more companies with proprietary LLMs, such as Anthropic, Perplexity AI, and Google DeepMind, rose to prominence.
As competition grew with the emergence of new players, companies with millions of dollars at their disposal started brute-forcing LLMs to outperform one another. That is, they resorted to the hassle-free approach of increasing model parameters, computational power, and dataset size rather than focusing on more efficient architectures or training techniques. The size of LLMs thus expanded at an exponential rate, with some surpassing the trillion-parameter mark. OpenAI was at the forefront of this brute-forcing spree with models like GPT-4 and o1.
This brute-forcing trend not only discouraged innovation but also created significant entry barriers for smaller companies and independent researchers. As a result, the AI landscape was monopolized by a handful of tech giants, and the democratization of AI started to feel like a far-off dream.
In addition to commercialization, geopolitics also played its part in obstructing the global democratization of AI. The US government put strict export-control regimes in place to confine access to high-end AI processing chips, including the Nvidia H800, to a handful of allies. In this way, the USA created a Technological Iron Curtain, akin to Russia’s Digital Iron Curtain and China’s Great Firewall. This move greatly hampered innovation and equitable AI development across the globe.
As a brief digression, it is worth noting that the US shot itself in the foot with this Technological Iron Curtain: it forced Chinese researchers to overcome the GPU shortage through innovative algorithmic solutions, and DeepSeek is the inevitable result of that endeavor.
DeepSeek’s Disruptive Entry
The dawn of January 20, 2025, witnessed a seismic reshuffle of the AI sector’s dominance when the Chinese company DeepSeek released its R1 model. It snatched the title of “most downloaded free app on the Apple App Store” from ChatGPT! Moreover, this low-cost, high-performance model rivals (and in some areas surpasses) OpenAI’s top-of-the-line o1 model. The following image shows its performance on various benchmarks.

Surprisingly, DeepSeek achieved this high performance not by brute-forcing but through a Mixture-of-Experts (MoE) architecture combined with reinforcement learning. It thus demonstrated that groundbreaking algorithmic solutions, rather than sheer computational power, are the key to AI success.
In traditional dense neural networks, all parameters are activated for every input, leading to substantial computational demands. DeepSeek’s MoE architecture, on the other hand, activates only a subset of parameters — referred to as “experts” — for each input. This selective activation allows the model to scale up in size without a corresponding increase in computational load. So, although the R1 model comprises 671 billion parameters, only 37 billion are activated for each token processed. Consequently, the model operates efficiently while minimizing the need for extensive computational resources.
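The selective activation described above can be illustrated with a toy routing sketch. This is not DeepSeek’s actual implementation; the expert count, dimensions, and gating scheme below are made-up illustrative values, and real MoE layers use learned feed-forward experts trained jointly with the router.

```python
# Toy sketch of Mixture-of-Experts (MoE) routing: a gating network scores all
# experts, but only the top-k experts actually run for a given token.
import numpy as np

rng = np.random.default_rng(0)

D_MODEL, N_EXPERTS, TOP_K = 16, 8, 2  # illustrative sizes, not R1's real ones

# Each "expert" is a small feed-forward layer (here reduced to one matrix).
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.1 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.1  # gating network


def moe_forward(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    logits = x @ router                   # score every expert (cheap)
    top = np.argsort(logits)[-TOP_K:]     # indices of the k best-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                  # softmax over the chosen experts only
    # Only TOP_K of the N_EXPERTS weight matrices are ever multiplied,
    # so compute scales with TOP_K, not with total parameter count.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))


token = rng.standard_normal(D_MODEL)
out = moe_forward(token)
active_fraction = TOP_K / N_EXPERTS  # here 25% of expert parameters per token
```

In R1’s case the analogous ratio is roughly 37B active out of 671B total parameters per token, which is exactly the efficiency gain this routing pattern buys.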
Moreover, the company applied reinforcement learning directly to the base model (in the R1-Zero variant) without relying on supervised fine-tuning. This approach allows the model to develop advanced reasoning capabilities naturally, enhancing performance on tasks such as mathematics and coding.
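The core idea of learning from outcome rewards alone can be sketched with a toy REINFORCE-style loop. This is a simplification, not DeepSeek’s actual algorithm (which uses a more sophisticated policy-optimization method on a full language model); the single-question task, action space, and learning rate below are all invented for illustration. The point it shows is that a rule-based reward for a verifiably correct answer, with no human-labeled demonstrations, is enough of a signal to improve a policy.

```python
# Toy sketch of outcome-reward reinforcement learning: the "model" is a
# softmax policy over 10 candidate answers; the reward is 1 only when the
# sampled answer matches a verifiable ground truth (no supervised labels).
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 10               # candidate answers 0..9 (illustrative)
CORRECT = 7                  # the verifiably correct answer
logits = np.zeros(N_ACTIONS) # stand-in for the base model's output head


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


for _ in range(500):
    probs = softmax(logits)
    a = rng.choice(N_ACTIONS, p=probs)       # sample an answer
    reward = 1.0 if a == CORRECT else 0.0    # rule-based outcome reward
    baseline = probs[CORRECT]                # expected reward, reduces variance
    grad = -probs                            # d log pi(a) / d logits ...
    grad[a] += 1.0                           # ... for the sampled action a
    logits += 0.5 * (reward - baseline) * grad  # REINFORCE update
```

After a few hundred iterations the policy concentrates its probability mass on the correct answer, having discovered it purely through trial, error, and reward.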
The training cost difference between R1 and its proprietary counterparts further underscores its trailblazing status. R1 was trained on lower-end GPUs with a price tag of a mere $5.6 million, an expenditure that contrasts sharply with the billions spent by its competitors. The following image makes the training cost difference clearer (for comparison purposes, a $1B cost is conservatively estimated for proprietary LLMs).

In addition to the aforementioned features, R1 has yet another feather in its cap: the company made a groundbreaking move by releasing its state-of-the-art model as ‘open-weight’ under the MIT License, allowing researchers and the community to study and build upon it. In this way, the company has dealt a major blow to AI monopolists.
Ironically, this democratization of AI was originally the founding vision of OpenAI. Back in late 2015, the company described itself thus: “As a non-profit, our aim is to build value for everyone rather than shareholders.” It went on to claim that “our patents (if any) will be shared with the world.” Today, however, it is evident that it is DeepSeek, not OpenAI, that is staying true to this vision.
Granted, we have yet to reach fully open-source AI, as DeepSeek did not release its training data; the release is nevertheless a solid step in that direction, and as such it can be regarded as the Linus Torvalds moment of AI.
DeepSeek is the Linus Torvalds moment of the AI industry
DeepSeek R1 represents a watershed moment in AI, setting a new standard for openness and innovation. The company didn’t just create a high-performance, low-cost AI model to rival ChatGPT: it broke the mold by releasing the model’s weights under the MIT License instead of monetizing it, making cutting-edge AI freely accessible to the world.
This openness will greatly impact tech innovation, much like Linus Torvalds’ decision to release the Linux kernel in 1991, which transformed the industry.
Just as companies today are investing billions to monopolize AI through proprietary LLMs, US tech giants like Microsoft, IBM, and Sun Microsystems were doing exactly the same back in the 1980s. They dominated the tech landscape through closed-source commercial software and operating systems (OS), making high-performance computing expensive and restricted through proprietary licensing, vendor lock-in, and costly enterprise support.
Fortunately, Linus Torvalds shattered this monopoly by releasing Linux and encouraging contributions from developers worldwide. In doing so, he introduced a decentralized, collaborative development model that paved the way for innovation and moved OS development beyond corporate silos to an open community of developers worldwide.
In a similar fashion, DeepSeek’s decision to open-source its R1 model’s weights has the potential to reshape the modern technology landscape, as it moves AI development beyond proprietary control to a worldwide community of AI researchers.
Moreover, this move will not only compel OpenAI and other leading AI companies to lower their costs but may also pressure them to embrace an open-source paradigm. As a starting point, these companies could release their older LLMs. We have already started to witness this change: in the aftermath of the DeepSeek shock, OpenAI’s CEO, Sam Altman, admitted in a Reddit “Ask Me Anything” thread that “I personally think we have been on the wrong side of history here and need to figure out a different open-source strategy.” He added that “not everyone at OpenAI shares this view, and it’s also not our current highest priority.”
Furthermore, it will serve as a catalyst for new research, inspiring advancements in model architecture and optimization techniques aimed at reducing LLM size and improving efficiency. This shift will lead to the global democratization of AI.
Conclusion
In short, DeepSeek’s R1 model represents a potential watershed moment in AI, akin to Linus Torvalds’ impact on the software industry. By embracing an open-weight approach, DeepSeek has challenged the contemporary AI monopolists. Hopefully, it will lead to a more collaborative, inclusive, and innovative AI ecosystem.