On January 20, 2025—the very same day Donald Trump was inaugurated as the 47th President of the United States—a Chinese AI company quietly released a model that would shake global capital markets within days. DeepSeek's DeepSeek-R1 matched or surpassed OpenAI's o1 model on benchmarks in mathematical reasoning, programming, and scientific problem-solving—yet its reported training cost was only about $5.57 million, a fraction of the hundreds of millions invested in comparable American models.[1] On January 27, US stock markets experienced violent turbulence: NVIDIA lost approximately $593 billion in market capitalization in a single day, the largest single-day loss for any company in US stock market history.[2] This event, dubbed the "DeepSeek Moment," revealed an unsettling truth: the series of semiconductor export controls that the United States had implemented since October 2022—policies designed to restrict China's access to advanced AI computing power—may not have been as effective as intended in curbing China's AI development. If China can train models rivaling those from America's top labs with less compute and lower costs, then the entire strategic premise of the geopolitical contest surrounding AI chips needs to be re-examined. This article attempts to analyze this structural transformation reshaping the global AI landscape from three dimensions: geopolitical game theory, open-source economics, and technological sovereignty.
I. Deconstructing DeepSeek's Technology: How Algorithmic Innovation "Circumvents" Hardware Limitations
To understand the geopolitical implications of the DeepSeek phenomenon, one must first grasp the nature of its technical achievement. DeepSeek-V3 (the foundation model for R1) is a 671-billion-parameter Mixture of Experts (MoE) model, but each inference activates only 37 billion parameters—meaning its actual computational requirements are far lower than what its nominal parameter count would suggest.[3]
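The arithmetic behind sparse activation can be made concrete. According to the V3 technical report, each token is routed to 8 of 256 routed experts; the toy sketch below (illustrative random router scores, not DeepSeek's actual code) shows why only a few percent of the routed-expert parameters participate in any single forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, k = 256, 8                    # V3 routes each token to 8 of 256 routed experts
scores = rng.normal(size=n_experts)      # toy router logits for one token
active = np.argsort(scores)[-k:]         # top-k expert selection

# Only these experts' parameters participate in this token's forward pass.
frac = k / n_experts
print(f"{len(active)} of {n_experts} routed experts active ({frac:.1%})")
```

The remaining gap between this 3.1% and the model's overall 37B/671B ≈ 5.5% active-parameter ratio is accounted for by components that are always active, such as the shared expert, attention layers, and embeddings.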
DeepSeek's core technical innovations center on three areas. First, Multi-head Latent Attention (MLA)—by compressing attention key-value pairs into a low-dimensional latent space, this technique dramatically reduces memory requirements and computational costs during inference. The traditional Multi-Head Attention mechanism is one of the core bottlenecks of the Transformer architecture; MLA uses elegant mathematical techniques to significantly alleviate this bottleneck while maintaining model performance.[3]
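A stylized NumPy sketch of the MLA idea, using made-up toy dimensions rather than DeepSeek's real configuration: instead of caching full per-head keys and values for every past token, only a small latent vector is cached, and keys and values are re-expanded from it when attention is computed.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, n_heads, d_head, d_latent = 1024, 16, 64, 128   # toy sizes, d_latent is small

W_down = 0.02 * rng.normal(size=(d_model, d_latent))           # hidden -> latent
W_up_k = 0.02 * rng.normal(size=(d_latent, n_heads * d_head))  # latent -> keys
W_up_v = 0.02 * rng.normal(size=(d_latent, n_heads * d_head))  # latent -> values

h = rng.normal(size=(1, d_model))            # hidden state of one new token
c = h @ W_down                               # only this latent vector is cached
k = (c @ W_up_k).reshape(n_heads, d_head)    # per-head keys, rebuilt on the fly
v = (c @ W_up_v).reshape(n_heads, d_head)    # per-head values, rebuilt on the fly

cache_mha = 2 * n_heads * d_head   # floats cached per token by standard MHA (K and V)
cache_mla = d_latent               # floats cached per token by MLA
print(f"KV-cache entries per token: {cache_mha} -> {cache_mla} "
      f"({cache_mha // cache_mla}x smaller)")
```

In this toy configuration the per-token cache shrinks 16x; the extra up-projection work at attention time is the price paid for the memory saving.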
Second, Auxiliary-loss-free Load Balancing. In MoE models, ensuring balanced workload distribution among different experts has long been an engineering challenge. Traditional methods use auxiliary loss functions to penalize imbalanced allocation, but this sacrifices the model's final performance. DeepSeek proposed a dynamic adjustment mechanism based on bias terms that achieves load balancing without compromising model quality.[3]
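The mechanism can be sketched in a few lines. In the toy simulation below (hypothetical sizes and update details, following the spirit of the paper rather than its exact formulation), a per-expert bias is added to the routing scores only when selecting experts; the bias of overloaded experts is nudged down and that of underloaded experts nudged up, so a "popular" expert gradually stops monopolizing tokens without any auxiliary loss term.

```python
import numpy as np

rng = np.random.default_rng(2)
n_experts, k, gamma = 8, 2, 0.01   # toy sizes; gamma is the bias update speed
bias = np.zeros(n_experts)         # per-expert bias, used only to pick experts

def route(token_scores):
    # The bias shifts which experts get selected; in the real scheme the
    # gating weights themselves are still computed from the unbiased scores.
    return np.argsort(token_scores + bias)[-k:]

for step in range(500):
    scores = rng.normal(size=(64, n_experts))   # router scores for a toy batch
    scores[:, 0] += 1.0                         # expert 0 is "popular" by default
    loads = np.zeros(n_experts)
    for s in scores:
        loads[route(s)] += 1
    # Push down the bias of overloaded experts, push up underloaded ones.
    bias -= gamma * np.sign(loads - loads.mean())

print("final per-expert loads:", loads)
```

After a few hundred steps, expert 0's bias has drifted down far enough to offset its popularity and the batch load spreads roughly evenly, with no gradient-based penalty ever applied to the model itself.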
Third, and most strategically significant, is the FP8 mixed-precision training framework. Under US export controls, DeepSeek reportedly used NVIDIA H800 GPUs (the export-compliant version for China, with interconnect bandwidth limited to 400 GB/s).[4] By developing an FP8 (8-bit floating point) mixed-precision training framework, DeepSeek reduced the floating-point precision of training from the industry-standard BF16 (16-bit) to FP8, effectively nearly doubling each GPU's computational capacity—fundamentally altering the calculus of "compute limitations."
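A rough way to see what dropping from 16-bit to 8-bit floats entails: FP8's E4M3 format keeps 4 exponent bits and 3 mantissa bits, so each stored value retains only a few percent of relative precision. The sketch below simulates E4M3 rounding in NumPy (a simplification: subnormals and the fine-grained per-block scaling that real FP8 training frameworks depend on are ignored).

```python
import numpy as np

def quantize_fp8_e4m3(x):
    """Round values to a simulated FP8 E4M3 grid: 4 exponent bits, 3 mantissa
    bits, largest finite value 448. Subnormals are ignored in this sketch."""
    x = np.clip(x, -448.0, 448.0)
    mant, exp = np.frexp(x)          # x = mant * 2**exp with |mant| in [0.5, 1)
    mant = np.round(mant * 16) / 16  # keep 3 stored mantissa bits
    return np.ldexp(mant, exp)

rng = np.random.default_rng(3)
w = rng.normal(size=10_000).astype(np.float32)
w8 = quantize_fp8_e4m3(w)
rel_err = np.abs(w8 - w).mean() / np.abs(w).mean()
print(f"mean relative rounding error: {rel_err:.3f}")
```

The rounding error lands in the low single-digit percent range, which is why FP8 is viable for many tensors, yet the format halves memory traffic and roughly doubles arithmetic throughput on hardware with native FP8 units.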
Building on V3, DeepSeek-R1 further incorporated large-scale Reinforcement Learning (RL), enabling the model to develop Chain of Thought-like reasoning capabilities—not merely providing answers, but demonstrating its reasoning process.[1] Most striking was the DeepSeek-R1-Zero experiment—researchers discovered that even without any human-annotated supervised fine-tuning data, through pure reinforcement learning alone, the model could "spontaneously" develop reasoning behaviors. This finding challenges the prevailing assumption in AI research that reasoning capabilities must be "taught" to models through carefully designed training data.
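One reason pure RL was workable here is that reasoning tasks such as math admit rule-based, verifiable rewards, so no learned reward model is needed. The sketch below illustrates that idea in simplified form (the tag format and point values are invented for the example, not DeepSeek's exact rules).

```python
import re

def reward(completion: str, gold_answer: str) -> float:
    """Rule-based reward in the spirit of R1-Zero: no learned reward model,
    just programmatic checks on format and final-answer correctness."""
    r = 0.0
    # Format reward: reasoning must appear inside <think>...</think> tags.
    if re.search(r"<think>.+?</think>", completion, flags=re.S):
        r += 0.5
    # Accuracy reward: the final answer after the think block must match.
    m = re.search(r"</think>\s*(.+?)\s*$", completion, flags=re.S)
    if m and m.group(1).strip() == gold_answer:
        r += 1.0
    return r

good = "<think>Two plus two is four.</think>4"
bad = "The answer is 4."
print(reward(good, "4"), reward(bad, "4"))  # 1.5 0.0
```

Because such a reward can be computed automatically at scale, the RL loop needs no human annotation at all, which is precisely what made the R1-Zero experiment possible.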
II. The Reality and Myths of the "Cost Shock": A Battle for Narrative Control
DeepSeek's claimed training cost of $5.57 million became the focal figure in global media coverage. However, this number requires careful interpretation.
First, the $5.57 million accounts only for the GPU rental costs of V3's final training phase (based on 2,048 H800 GPUs, 14.8 trillion training tokens, and approximately two months of training time).[3] It excludes: preliminary research and experimentation costs, data collection and processing costs, personnel salaries, multiple failed training experiments, and the reinforcement learning training costs for the R1 phase. Some analysts estimate that including these indirect costs, the full life-cycle development cost of DeepSeek-R1 may range from $30 million to $100 million—still far below comparable models from OpenAI or Google, but well above the sensationalized "$5.57 million" headline.[5]
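The headline figure itself is easy to reproduce from the report's own inputs: the V3 technical report prices the run at an assumed rental rate of $2 per H800 GPU-hour over roughly 2.79 million GPU-hours.

```python
# Reproducing the headline number from the V3 technical report's stated inputs.
gpu_hours = 2.788e6   # total H800 GPU-hours the report attributes to training
rate = 2.0            # assumed rental price of $2 per GPU-hour, per the report
cost = gpu_hours * rate
print(f"${cost/1e6:.3f}M")   # the figure rounded to $5.57M in media coverage
```

The calculation makes the accounting boundary visible: anything not billed in GPU-hours for that one run, such as salaries, failed experiments, or data work, falls outside the number by construction.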
Second, reports indicate that DeepSeek's parent company, High-Flyer Capital Management, may have procured more than 10,000 NVIDIA A100 GPUs before US export controls took effect.[6] What role these high-end GPUs played in research-phase experimentation remains unknown to outsiders. In short, "$5.57 million" is more a carefully framed marketing figure than a comprehensive cost accounting.
Yet even accounting for these adjustments, DeepSeek's achievement carries structural significance. It proves a critical proposition: improvements in algorithmic efficiency can partially substitute for brute-force hardware scaling. To use an analogy: America's AI strategy resembles controlling an adversary's steel supply in an arms race, hoping to limit their shipbuilding capacity. DeepSeek's response is—we don't need bigger ships; we need better ships.[7]
The policy implications of this proposition are profound. As analysis from the research organization Epoch AI points out, over the past two years the training compute efficiency of top-tier models has improved by approximately 2-3x per year, meaning the same model performance requires only one-half to one-third of the prior year's computational resources.[8] If this trend continues, the "window of opportunity" that export controls can provide will continue to shrink.
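Compounding makes the implication stark. Taking the midpoint of the cited 2-3x annual efficiency gain:

```python
# How fast the compute needed for a fixed capability level shrinks,
# assuming a constant 2.5x annual efficiency gain (midpoint of the cited range).
factor = 2.5
for years in range(1, 5):
    remaining = 1 / factor ** years
    print(f"after {years} year(s): {remaining:.1%} of the original compute suffices")
```

Under this assumption, within three years a capability that once demanded a frontier-scale cluster needs only a few percent of that compute, which is the mechanism by which the control window closes.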
III. The Geopolitical Economics of Open-Source AI: A Multi-Dimensional Game
DeepSeek's decision to open-source its model weights under the MIT license involves complex game-theoretic considerations.
From an economics perspective, open-sourcing is a "platform strategy"—by providing a free foundation model, it builds ecosystem stickiness and monetizes through derivative markets such as API services and enterprise customization. Meta's Llama series has proven the viability of this model: after Llama 3.1 was released in 2024, it rapidly became the world's most widely used open-source LLM, enabling Meta to establish formidable brand equity and ecosystem control within the AI developer community.[9]
However, the open-source decisions of Chinese AI companies carry an additional geopolitical dimension. Against the backdrop of escalating US chip export controls, Chinese AI companies face a Prisoner's Dilemma: if they keep models closed-source, they are confined to the domestic market and cannot participate in shaping global AI standards; if they open-source, they can influence the technological trajectory of AI development through widespread global adoption, while further diluting the effectiveness of export controls—because once model weights are public, anyone can perform inference and fine-tuning on any hardware.[10]
DeepSeek's open-source strategy can be understood as a form of "technological diplomacy"—by providing high-quality free models, it builds global dependency relationships, thereby gaining leverage in the competition for influence over international AI governance. This bears resemblance to the Cold War-era logic of the Soviet Union extending geopolitical influence through technical assistance, but in the digital age, the speed of such influence propagation is exponential.
The current global open-source AI ecosystem has formed a quadripolar structure: America's Meta (Llama series), France's Mistral, China's DeepSeek/Alibaba Cloud Qwen, and community models led by academic institutions worldwide (such as BigScience's BLOOM).[11] Each pole has distinct motivations for open-sourcing—Meta aims to counter the closed-source monopoly of OpenAI and Google; Mistral represents Europe's technological sovereignty aspirations; DeepSeek combines commercial expansion with geopolitical considerations; and community models embody academia's idealism about AI democratization. This multipolar structure means that open-source AI is no longer a purely technical movement—it has become a new battleground in great power technology competition.
IV. America's Strategic Dilemma: The "Möbius Strip" of Chip Controls
DeepSeek's rise has exposed the structural contradictions of America's AI chip export control policies.
Since the Bureau of Industry and Security (BIS) of the US Department of Commerce issued its first round of advanced computing export controls targeting China in October 2022,[4] the United States has undertaken multiple rounds of "patching": after the US restricted NVIDIA H100 sales to China, Chinese buyers pivoted to the compliant H800 and A800 variants; when BIS further tightened standards, Huawei introduced its domestically developed Ascend 910B chip; and DeepSeek's algorithmic innovations fundamentally challenged the assumption that "restricting compute equals restricting AI capability."
This creates a policy "Möbius strip": controls stimulate indigenous innovation in the controlled party, indigenous innovation undermines the effectiveness of controls, which then prompts the controlling party to tighten controls further, and the cycle repeats.[12] More notably, each round of tightened controls brings significant collateral damage: NVIDIA faces financial pressure from losing the Chinese market (which once accounted for 25% of its data center revenue), while allies (such as the Netherlands' ASML and Japan's Tokyo Electron) face commercial losses and diplomatic friction from being required to comply with the controls.[13]
The Trump administration's policy trajectory after taking office in January 2025 has further increased uncertainty. On one hand, Trump revoked the Biden administration's AI executive order, signaling "AI deregulation"; on the other hand, his China policy team leans toward more aggressive technological decoupling.[14] This combination of "domestic relaxation, external hardline" may produce a paradoxical outcome—American companies accelerate innovation in a more permissive domestic environment, while stricter export controls may further stimulate China's indigenous substitution, ultimately accelerating rather than delaying the global diffusion of AI technology.
V. The Double-Edged Sword of Open-Source AI: The Promise and Risks of Democratization
The proliferation of open-source AI models produces impacts at two levels that must be assessed separately.
On the positive side, open-source AI is substantively lowering the barrier to AI development. According to Hugging Face statistics, by the end of 2025, the platform hosted over 1 million models with more than 5 million monthly active developers.[15] Open-source models enable SMEs, academic institutions, and even individual developers to build their own AI applications without relying on tech giant APIs. This is particularly important for developing countries—they may be unable to afford OpenAI or Google's enterprise-level pricing, but can build locally tailored applications based on open-source models. In a sense, open-source AI is a force for technological democratization—it challenges the power structure that dictates "only the wealthiest nations and largest corporations can participate in the AI revolution."
On the risk side, once an open-source model is released, it cannot be "recalled": it can be downloaded, modified, and stripped of its safety guardrails by anyone. The initial version of Meta's Llama, whose weights leaked online shortly after its limited research release, had its safety restrictions removed by third parties within days.[16] This raises serious security concerns: could open-source reasoning models be used to accelerate bioweapon design, cyberattack tool development, or large-scale disinformation generation? This is not a hypothetical risk: analysis from the RAND Corporation indicates that current open-source LLMs can already provide significant "capability uplift" to individuals with basic knowledge, particularly in chemistry and biology knowledge acquisition.[17]
The EU AI Act attempts to strike a balance between these two dimensions. Article 2 of the Act provides partial exemptions for "free and open-source" AI models—if a model is released under an open license and not deployed as part of a commercial AI system, it may be exempt from certain compliance obligations. However, open-source models posing "systemic risk" must still comply with relevant requirements.[18] This compromise reflects a fundamental governance dilemma: open-source is a global public good, but the governance of global public goods requires global coordination mechanisms—and in the current geopolitical climate, such coordination mechanisms are nearly impossible to establish.
VI. Taiwan's Strategic Opportunity: Positioning within the Open-Source Ecosystem
Taiwan's role in the global open-source AI ecosystem must be considered in light of its unique industrial structure.
The first dimension: hardware infrastructure. The proliferation of open-source AI does not reduce demand for advanced chips—in fact, it may increase demand. As more organizations and individuals use open-source models for inference and fine-tuning, global demand for AI inference chips will expand further. TSMC's advanced process nodes (3nm, 2nm) and advanced packaging technologies (CoWoS) remain the most irreplaceable links in the global AI hardware supply chain for the foreseeable future.[19] DeepSeek's success is not proof that "good chips aren't needed," but rather that "good chips need to be used more intelligently"—this actually reinforces the value of Taiwan's chip manufacturing, because efficiency-oriented training methods demand higher, not lower, chip quality.
The second dimension: the Traditional Chinese AI ecosystem. Currently, mainstream open-source LLMs generally underperform in Traditional Chinese compared to English and Simplified Chinese.[20] This creates a unique window of opportunity for Taiwan—building on open-source foundation models, performing specialized fine-tuning for Traditional Chinese, Taiwan's legal system, and Taiwanese industry terminology to create vertical models serving the Taiwan market (and global Traditional Chinese users). The National Science and Technology Council's TAIDE project has taken the first step in this direction, but it requires greater industry participation and more sustained investment.
The third dimension: serving as a "trusted third party." As US-China technological decoupling intensifies, many countries and enterprises are reluctant to fully depend on either American or Chinese AI ecosystems. Taiwan—as a mature democracy with strong technological capabilities, reliable intellectual property protection, and the institutional stability that doesn't "change the rules with political winds"—has the potential to become a "trusted third party" in the global AI supply chain. This is not about replacing the position of the US or China, but about providing a safe "third path" between the two.
Of course, this strategic positioning requires institutional support. Taiwan needs to establish institutional frameworks aligned with international standards in AI governance, cross-border data transfer, and intellectual property protection—not merely for compliance, but as the foundation for building international trust.
VII. Conclusion: From the Compute Race to the Intelligence Race
Perhaps the most fundamental insight revealed by the DeepSeek phenomenon is this: the essence of AI competition is shifting from a "compute race" to an "intelligence race"—not who can stack the most GPUs, but who can most intelligently use limited resources. The implications of this shift for the global AI landscape are structural.
For the United States, it means that export controls still hold short-term strategic value but cannot serve as the sole or primary tool for maintaining AI leadership. The real moat lies in the depth of foundational research, the richness of the talent ecosystem, and the openness of the institutional environment.[21]
For China, DeepSeek's success is a short-term confidence boost, but it also brings new challenges—how to balance open-source commitments with national security considerations; how to sustain momentum in algorithmic innovation leadership; and how to respond to potentially intensified export controls provoked by DeepSeek's success.
For Taiwan, the DeepSeek Moment is a clear reminder: Taiwan's AI strategy cannot remain limited to the positioning of "manufacturing AI chips for the world." As algorithmic efficiency improves rapidly, Taiwan must simultaneously build depth in the software ecosystem (Traditional Chinese models, application layer) and institutional architecture (AI governance, data regulations) to ensure its long-term irreplaceability in the global AI supply chain.
The geopolitics of open-source AI is an ongoing game—it has no final equilibrium, only continuously evolving strategic interactions. In this game, the winner is not the most powerful player, but the most adaptive one.
References
- DeepSeek-AI. (2025). DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv:2501.12948
- Reuters. (2025). Nvidia loses nearly $600 billion in market value amid DeepSeek shock. reuters.com
- DeepSeek-AI. (2024). DeepSeek-V3 Technical Report. arXiv:2412.19437
- U.S. Department of Commerce, Bureau of Industry and Security. (2022). Implementation of Additional Export Controls: Certain Advanced Computing and Semiconductor Manufacturing Items. 87 FR 62186. Federal Register
- SemiAnalysis. (2025). DeepSeek: The Real Cost Behind the Headlines. semianalysis.com
- Financial Times. (2025). DeepSeek: the Chinese AI group that shook the world. ft.com
- Ding, J. (2025). DeepSeek and the Limits of Export Controls. ChinaTalk. chinatalk.media
- Epoch AI. (2025). Trends in AI Training Compute Efficiency. epochai.org
- Meta AI. (2024). Introducing Llama 3.1: Our most capable openly available model. ai.meta.com
- Hwang, T. (2025). Open Source AI and National Security. Center for Security and Emerging Technology (CSET). georgetown.edu
- Bommasani, R. et al. (2023). On the Opportunities and Risks of Foundation Models. Stanford CRFM. arXiv:2108.07258
- Allen, G. C. (2025). China's New AI Capabilities and the Failure of Export Controls. Center for Strategic and International Studies. csis.org
- Miller, C. (2022). Chip War: The Fight for the World's Most Critical Technology. New York: Scribner.
- The White House. (2025). Executive Order: Removing Barriers to American Leadership in Artificial Intelligence. whitehouse.gov
- Hugging Face. (2025). The State of Open Source AI 2025. huggingface.co
- Seger, E. et al. (2023). Open-Sourcing Highly Capable Foundation Models. Centre for the Governance of AI. governance.ai
- Mouton, C. et al. (2024). The Operational Risks of AI in Large-Scale Biological Attacks. RAND Corporation. rand.org
- European Parliament and Council. (2024). Regulation (EU) 2024/1689 — Artificial Intelligence Act, Article 2. eur-lex.europa.eu
- Taiwan Semiconductor Manufacturing Company. (2025). 2025 Annual Report. tsmc.com
- National Science and Technology Council. (2025). TAIDE: Trustworthy AI Dialogue Engine Project. taide.tw
- Amodei, D. (2025). On DeepSeek and Export Controls. Anthropic Blog. anthropic.com