Caixin

In-Depth: What DeepSeek Achieved (AI Translation)

Published: Feb. 8, 2025  3:14 p.m.  GMT+8,  Updated: Feb. 8, 2025  3:14 p.m.
This article was translated from Chinese using AI. The translation may contain inaccuracies.

By Liu Peilin and Qu Yunxu, Caixin Weekly

A Chinese artificial intelligence (AI) company has burst onto the scene and sent shock waves through U.S. capital markets. Financial institutions rushed to dump chip stocks, wiping nearly one-fifth off the market value of Nvidia, the world's largest listed company, in a single night, a decline rarely seen in history.

DeepSeek (深度求索), a large-model company under the Hangzhou-based private fund manager High-Flyer Quant (幻方量化), released its new-generation open-source pre-trained model, DeepSeek-V3, on December 26, 2024. It launched an app for general users on January 15, 2025, and on January 20 released the open-source model R1, which is benchmarked against o1, OpenAI's most advanced reasoning model.

Developers were astonished by the innovations DeepSeek presented in its technical report, which used sophisticated algorithms to match the capabilities of OpenAI's most advanced models. Ordinary users also switched sides: using ChatGPT's latest model requires a $20 monthly fee, whereas DeepSeek is free.

Explore the story in 30 seconds
  • DeepSeek, a Chinese AI company, launched its pre-trained model DeepSeek-V3, challenging giants like OpenAI, and causing significant upheaval in global markets, particularly affecting Nvidia's stock.
  • The company trained its model with far fewer Nvidia GPUs than its U.S. counterparts, igniting debate over efficiency and the sustainability of AI investment strategies.
  • DeepSeek's rise led to widespread app interest, government scrutiny, and discussions about the future of open-source AI; it also provoked responses from global tech companies and policymakers on AI competitiveness.
AI generated, for reference only
Explore the story in 3 minutes

DeepSeek, a Chinese AI company under Hangzhou-based High-Flyer Quant (幻方量化), released its latest model, DeepSeek-V3, on December 26, 2024. The company's rise jolted U.S. capital markets and wiped nearly one-fifth off Nvidia's market value in a single night. The technical advances in DeepSeek-V3 and R1 put the company on a par with leading rivals such as OpenAI and prompted a broad reaction across global financial and technology sectors [para. 1][para. 3].

DeepSeek quickly gained popularity in the U.S., overtaking established apps like ChatGPT on Apple's App Store. The surge was fueled by the fact that DeepSeek's app is free, while ChatGPT charges a monthly subscription fee. In January 2025, visits to DeepSeek's website jumped to 278 million, still well behind ChatGPT's 3.8 billion. DeepSeek's rise over the Spring Festival of the Year of the Snake drew comparisons to ChatGPT's earlier breakout [para. 2][para. 3][para. 5].

DeepSeek-V3 emerged as a potent alternative to models like GPT-4o and o1, achieving similar capabilities with significantly lower resource expenditures. The technical paper for DeepSeek-V3 highlighted a remarkable cost-benefit ratio, utilizing only 2,048 GPUs compared to the tens of thousands employed by OpenAI, resulting in an approximate training cost of $5.576 million. This efficiency has sparked discussions about the inefficiencies of existing AI models and has positioned DeepSeek as a more scalable and resource-efficient option [para. 6][para. 7].
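
The $5.576 million figure can be reproduced with simple arithmetic. The sketch below assumes the GPU-hour breakdown and the $2-per-GPU-hour rental price stated in the DeepSeek-V3 technical report; neither number appears in this summary, and the function names are illustrative only. The wall-clock estimate further assumes that all 2,048 GPUs ran concurrently through every stage.

```python
# Assumed inputs, taken from the DeepSeek-V3 technical report (not from this article).
GPU_HOURS_THOUSANDS = {
    "pre_training": 2664,
    "context_extension": 119,
    "post_training": 5,
}
RENTAL_USD_PER_GPU_HOUR = 2.0  # rental price assumed in the report
NUM_GPUS = 2048                # cluster size cited in the summary above


def total_gpu_hours(stages_in_thousands):
    """Sum the per-stage GPU hours (given in thousands) into plain GPU hours."""
    return sum(stages_in_thousands.values()) * 1_000


def training_cost_usd(stages_in_thousands, usd_per_gpu_hour):
    """Estimated rental cost of the training run in U.S. dollars."""
    return total_gpu_hours(stages_in_thousands) * usd_per_gpu_hour


def wall_clock_days(stages_in_thousands, num_gpus):
    """Rough duration if every GPU is busy for the whole run (an assumption)."""
    return total_gpu_hours(stages_in_thousands) / num_gpus / 24


if __name__ == "__main__":
    cost = training_cost_usd(GPU_HOURS_THOUSANDS, RENTAL_USD_PER_GPU_HOUR)
    days = wall_clock_days(GPU_HOURS_THOUSANDS, NUM_GPUS)
    print(f"Estimated training cost: ${cost:,.0f}")
    print(f"Implied wall-clock time on {NUM_GPUS} GPUs: {days:.1f} days")
```

The estimate covers only the rental cost of the final training run under these assumptions. It does not include earlier research, ablation experiments, data, or staff costs.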

Significant figures in the tech industry, including Microsoft CEO Satya Nadella and OpenAI's Sam Altman, have acknowledged DeepSeek's influence. While public discourse previously favored U.S. companies for their substantial capital investments and technological leadership, the emergence of DeepSeek challenges this dynamic. NVIDIA and OpenAI, traditionally seen as market leaders, faced repercussions with NVIDIA's share price dropping 16.86%, reflecting the largest single-day market value decline in its history. This event created a ripple effect among U.S. chip stocks, highlighting the impact of DeepSeek on the global stage [para. 9][para. 11][para. 13].

The open-source nature of DeepSeek models, compared with OpenAI's closed-source approach, broadens access to cutting-edge technology and reduces barriers to entry in the AI industry. This shifts the landscape by demonstrating that major technological advancements do not necessarily require massive investments in computational power, as previously believed. Major tech companies are reassessing their strategies in response to DeepSeek's cost-effectiveness, highlighting the evolving paradigm in AI technology deployment [para. 14][para. 15].

DeepSeek's breakthrough is reshaping the AI industry by showing that algorithmic and engineering innovation can produce globally competitive technology. Its rise amid geopolitical tensions showcases China's ability to draw on homegrown research and development despite external pressure. As the AI sector continues to grapple with these disruptions, more companies worldwide, including some in the U.S., are starting to integrate DeepSeek's models and use the open-source approach to advance their own offerings [para. 17][para. 19][para. 21].

Who’s Who
High-Flyer Quant
幻方量化
High-Flyer Quant is a private Chinese firm known for applying AI and computer technology to quantitative trading. Founded by Liang Wenfeng in 2015, it shifted its focus from financial trading to AI, accumulating significant GPU resources. In 2023, its AI team became independent, forming the company DeepSeek. Despite not engaging in external financing or competing for computational power, it managed to develop advanced AI models, including the open-source DeepSeek-V3 and R1 models.
DeepSeek
DeepSeek
DeepSeek is a Chinese AI company that gained global attention for its open-source, cost-efficient AI models. It challenged industry leaders like OpenAI by launching DeepSeek-V3 and R1, comparable to GPT-4o and o1, respectively. DeepSeek's models are noted for their low GPU requirements and high efficiency. The company's success led to significant impacts on the AI industry and stock markets, particularly affecting Western companies like Nvidia.
NVIDIA
英伟达
NVIDIA experienced a significant market impact due to DeepSeek's AI advancements, with its stock price dropping 16.86% and losing $589 billion in market value. This decline is attributed to the financial market's realization of DeepSeek’s ability to train advanced models at lower GPU costs, challenging the existing AI infrastructure's dependency on NVIDIA's chips. NVIDIA acknowledged DeepSeek’s achievements but emphasized the ongoing need for its GPUs during the inference process.
Microsoft
微软
Microsoft CEO Satya Nadella praised DeepSeek at the Davos forum, recognizing China's AI industry development as significant. Despite Microsoft's close relationship with OpenAI, the company quickly announced plans to integrate DeepSeek R1 into its service offerings, aiming to optimize on-device AI assistants. Additionally, Microsoft will maintain balanced investments in model training and inference, expecting continued growth in capital expenditure, influenced by DeepSeek's cost-effective model advancements.
Google
谷歌
The article mentions that Google's capital expenditure is expected to be $75 billion in 2025, growing over 40% from $52.5 billion in 2024. Despite a significant rise in capital expenditure, there is ongoing debate over the justification of these investments in AI infrastructure, especially after DeepSeek's AI developments. Following an earnings report, Google's stock fell 6.94% on February 5, indicating investor concerns about these investments.
Apple
苹果
The article mentions that DeepSeek's app quickly rose to the top of the free app chart on Apple's App Store in the United States, surpassing popular apps such as ChatGPT, Temu, Paramount, and Shein.
Amazon
亚马逊
Amazon announced on January 30 that it had deployed the DeepSeek R1 model on its AI model deployment platform. Amazon's CEO, Andy Jassy, stated in a February 7 earnings call that the optimization and reduced costs in model inference due to DeepSeek could increase overall tech spending. Amazon projected its 2025 capital expenditure to reach $100 billion, up from $83 billion in 2024, driven primarily by AI advancements.
Meta
Meta
Meta experienced a significant stock price rebound on the date of the AI model announcements, closing with a 1.91% gain. However, internally, there was concern about DeepSeek's low training cost; a Meta engineer mentioned it caused worry among management regarding their own extensive AI investments. CEO Mark Zuckerberg responded to these concerns by emphasizing their strategic advantage in substantial infrastructure investments despite the lowered training requirements potentially highlighted by DeepSeek's model.
OpenAI
OpenAI
OpenAI faced significant competition from Chinese company DeepSeek, which launched models rivaling its own with lower computational resources. Despite launching advanced models like o1, OpenAI's market lead was challenged, prompting CEO Sam Altman to acknowledge DeepSeek's impact. OpenAI advocates for export restrictions on advanced models and emphasizes leveraging closed-source strategies to maintain technological advantage, but is pressured by DeepSeek's open-source success.
Anthropic
Anthropic
Anthropic is a competitor to OpenAI and its CEO, Dario Amodei, indicated skepticism about DeepSeek's achievements. He suggested that DeepSeek's model capabilities do not surpass those of Anthropic's Claude 3.5 Sonnet, which was trained months before DeepSeek's release. He also advocated for strict U.S. export controls to prevent China from obtaining advanced AI capabilities, emphasizing the need for extensive resources to realize AI's full potential.
TSMC
台积电
The article mentions that on January 27, TSMC (Taiwan Semiconductor Manufacturing Company) experienced a significant stock drop of 13.33% amid a broader sell-off in U.S. chip stocks. This was triggered by the impact of the Chinese AI company DeepSeek on the market, shaking the perceived dominance of U.S. firms in the AI sector and leading to significant financial recalibrations.
Micron
美光
According to the article, Micron (NASDAQ: MU) experienced a significant stock drop of 11.71% on January 27, following the impactful debut of the Chinese AI company DeepSeek. This event led to a broader sell-off in the U.S. chip sector.
Broadcom
博通
Broadcom (NASDAQ: AVGO) experienced a significant stock price drop amid a broader sell-off in chip stocks influenced by the rise of DeepSeek, a Chinese AI company. On January 27, Broadcom's shares fell by 17.40%, alongside other major chip companies like NVIDIA and TSMC, as investors reacted to the potential impact of DeepSeek's advancements on the AI industry and related markets.
ARM
ARM
On January 27, ARM (NASDAQ: ARM) experienced a significant drop in the stock market, with shares falling 10.19% as part of a broader decline among U.S. chip stocks. This occurred amid a major sell-off triggered by the rise of the Chinese AI company DeepSeek, which caused financial institutions to reassess their investments in chip companies like ARM.
ASML
阿斯麦
The article mentions ASML in the context of AI hardware and technological investments. ASML is highlighted as a dominant company in the photolithography machine industry, essential for producing GPUs. The discussion includes concerns about the long-term competitiveness of firms like ASML, as it stands out with little competition, and the challenges in replicating the rapid performance improvements once common in technology sectors, such as those seen during the PC era.
ByteDance
字节跳动
After the launch of DeepSeek-V2, a price war in the sector was triggered. ByteDance, along with other key firms like Alibaba Cloud and Zhipu AI, responded by lowering their prices. Moreover, on February 4, after the DeepSeek phenomenon, ByteDance's platform Volcano Engine announced the launch of the full-version DeepSeek models.
Alibaba Cloud
阿里云
After DeepSeek gained popularity, Alibaba Cloud announced on January 28th that it would open source its visual understanding model Qwen2.5-VL and upgrade its flagship model Qwen2.5-Max to compete with DeepSeek-V3. On February 3rd, Alibaba Cloud also released distilled versions of DeepSeek-V3 and R1 models, promoting "three-step, zero-code" deployment, and offered one-click deployment services for the original-sized models in its model library.
Baidu
百度
Following DeepSeek's rise, Baidu Cloud announced on February 3 the launch of deployment services for distilled versions of DeepSeek-V3 and R1 models, offering free model invocation for two weeks and highlighting its "super low" pricing strategy.
Tencent
腾讯
Tencent Cloud launched a deployment tool for the DeepSeek R1 model on February 2, 2025, supporting distilled versions with 1.5B, 7B, 8B, and 14B parameters. The company claims that developers can integrate these models in just "two steps and within three minutes." The tool highlights model "distillation," a compression technique that reduces model complexity and parameter count for efficient deployment.
Huawei
华为
Huawei's Ascend chip was used to adapt and launch DeepSeek-V3 full-size model services, taking advantage of the large inventory and lower cost compared to NVIDIA chips. Huawei's team worked over the Lunar New Year to handle DeepSeek's traffic surge and launched inference services by February 1. Their efforts underscore the rising relevance of domestic chips amid increasing U.S. export restrictions.
Silicon Flow
硅基流动
Silicon Flow, founded by Yuan Jinhui, provides inference deployment services for open-source large models. They quickly launched inference services for DeepSeek-V2 using Nvidia H100 GPUs, achieving faster inference speeds than DeepSeek's official offerings. Despite not foreseeing DeepSeek's explosive success, they collaborated with Huawei to manage the increased traffic and released full-sized DeepSeek model inference services by February 1st, 2025, noting the significant user growth due to DeepSeek.
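
Several figures quoted in the entries above can be cross-checked with basic arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the Nvidia calculation assumes the share count was unchanged, so that the 16.86% fall in share price equals the percentage fall in market value.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def implied_value_before(loss, pct_drop):
    """Value before a drop, given the absolute loss and the percentage drop."""
    return loss / (pct_drop / 100)


# Capital expenditure figures from the Google and Amazon entries (USD billions).
google_growth = pct_change(52.5, 75.0)
amazon_growth = pct_change(83.0, 100.0)

# Nvidia entry: $589 billion lost on a 16.86% decline (USD billions).
nvidia_before = implied_value_before(589.0, 16.86)

if __name__ == "__main__":
    print(f"Google capex growth, 2024 to 2025: {google_growth:.1f}%")
    print(f"Amazon capex growth, 2024 to 2025: {amazon_growth:.1f}%")
    print(f"Nvidia implied market value before the drop: ${nvidia_before:,.0f} billion")
```

Under these assumptions, the Google figure agrees with the entry's "over 40%" growth, and the Nvidia figures imply a market value of roughly $3.5 trillion before the sell-off.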