DeepSeek might not be as disruptive as claimed, firm reportedly has 50,000 Nvidia GPUs and spent $1.6 billion on buildouts
The fabled $6 million was just a portion of the total training cost.
Chinese startup DeepSeek recently took center stage in the tech world with the startlingly low compute usage it claimed for its advanced AI model R1, which is believed to be competitive with OpenAI's o1 even though the company says its model cost only $6 million and took 2,048 GPUs to train. However, industry analyst firm SemiAnalysis reports that the company behind DeepSeek incurred $1.6 billion in hardware costs and has a fleet of 50,000 Nvidia Hopper GPUs, a finding that undermines the idea that DeepSeek reinvented AI training and inference with dramatically lower investments than the leaders of the AI industry.
DeepSeek operates an extensive computing infrastructure with approximately 50,000 Hopper GPUs, the report claims. This includes 10,000 H800s and 10,000 H100s, with additional purchases of H20 units, according to SemiAnalysis. These resources are distributed across multiple locations and serve purposes such as AI training, research, and financial modeling. The company's total capital investment in servers is around $1.6 billion, with an estimated $944 million spent on operating costs, according to SemiAnalysis.
DeepSeek took the AI world by storm when it disclosed the minuscule hardware requirements of its DeepSeek-V3 Mixture-of-Experts (MoE) AI model, which are vastly lower than those of U.S.-based models. Then DeepSeek shook the high-tech world with its OpenAI-competitive R1 AI model. However, the reputable market intelligence company SemiAnalysis has published findings indicating that the company has some $1.6 billion worth of hardware investments.
DeepSeek originates from High-Flyer, a Chinese hedge fund that adopted AI early and heavily invested in GPUs. In 2023, High-Flyer launched DeepSeek as a separate venture solely focused on AI. Unlike many competitors, DeepSeek remains self-funded, giving it flexibility and speed in decision-making. Despite claims that it is a minor offshoot, the company has invested over $500 million into its technology, according to SemiAnalysis.
A major differentiator for DeepSeek is its ability to run its own data centers, unlike most other AI startups that rely on external cloud providers. This independence allows for full control over experiments and AI model optimizations. In addition, it enables rapid iteration without external bottlenecks, making DeepSeek highly efficient compared to traditional players in the industry.
Then there is something that one would not expect from a Chinese company: talent acquisition from mainland China, with no poaching from Taiwan or the U.S. DeepSeek exclusively hires from within China, focusing on skills and problem-solving abilities rather than formal credentials, according to SemiAnalysis. Recruitment efforts target institutions like Peking University and Zhejiang University, offering highly competitive salaries. According to the research, some AI researchers at DeepSeek earn over $1.3 million, exceeding compensation at other leading Chinese AI firms such as Moonshot.
Thanks to this influx of talent, DeepSeek has pioneered innovations like Multi-Head Latent Attention (MLA), which required months of development and substantial GPU usage, SemiAnalysis reports. DeepSeek emphasizes efficiency and algorithmic improvements over brute-force scaling, reshaping expectations around AI model development. This approach has led some to believe that rapid algorithmic advances may reduce the demand for high-end GPUs, impacting companies like Nvidia.
A recent claim that DeepSeek trained its latest model for just $6 million has fueled much of the hype. However, this figure refers only to a portion of the total training cost, specifically the GPU time required for pre-training. It does not account for research, model refinement, data processing, or overall infrastructure expenses. In reality, DeepSeek has spent well over $500 million on AI development since its inception. Unlike larger firms burdened by bureaucracy, DeepSeek’s lean structure enables it to push forward aggressively in AI innovation, SemiAnalysis believes.
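To see how a headline number of this kind arises, the arithmetic can be sketched as rented GPU hours multiplied by an hourly rate. The GPU-hour counts and the $2 per GPU-hour rental rate below are not given in this article; they are assumptions recalled from DeepSeek's own V3 technical report and should be checked against it. The function name is purely illustrative.

```python
# Rough sketch: a "training cost" computed purely as GPU rental time.
# All inputs are ASSUMPTIONS taken from DeepSeek's V3 technical report as
# recalled, not from this article; verify before relying on them.

def gpu_rental_cost(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Cost of GPU time alone; excludes research, data, staff, and capex."""
    return gpu_hours * usd_per_gpu_hour

PRETRAIN_GPU_HOURS = 2_664_000       # assumed: pre-training on H800s
CONTEXT_EXT_GPU_HOURS = 119_000      # assumed: context-length extension
POST_TRAIN_GPU_HOURS = 5_000         # assumed: post-training
USD_PER_GPU_HOUR = 2.0               # assumed H800 rental rate

total_gpu_hours = (
    PRETRAIN_GPU_HOURS + CONTEXT_EXT_GPU_HOURS + POST_TRAIN_GPU_HOURS
)
headline_cost = gpu_rental_cost(total_gpu_hours, USD_PER_GPU_HOUR)

# Server capital expenditure reported by SemiAnalysis (from the article).
SERVER_CAPEX_USD = 1.6e9
share_of_capex = headline_cost / SERVER_CAPEX_USD

print(f"GPU hours: {total_gpu_hours:,}")
print(f"GPU-time cost: ${headline_cost:,.0f}")
print(f"Share of reported server capex: {share_of_capex:.2%}")
```

Under these assumptions the GPU-time figure lands a little below $6 million, well under one percent of the server investment SemiAnalysis reports, which is the gap the article is describing.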
DeepSeek's rise underscores how a well-funded, independent AI company can challenge industry leaders. However, the public discourse might have been driven by hype. Reality is more complex: SemiAnalysis contends that DeepSeek’s success is built on strategic investments of billions of dollars, technical breakthroughs, and a competitive workforce. In other words, there are no miracles. As Elon Musk noted a year or so ago, if you want to be competitive in AI, you have to spend billions per year, which is reportedly in the range of what DeepSeek spent.
Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.
JTWrenn: First rule of tech when dealing with Chinese companies. They are part of the state and the state has a vested interest in making the USA and Europe look bad. Triple check their numbers. Do the same for Elon.
alrighty_then: I'm not shocked but didn't have enough confidence to buy more NVIDIA stock when I should have. Now Monday morning will be a race to sell airline stocks and buy some big green before everyone else does.
quorm: This is just cope aiming to protect the inflated value of "AI" companies. It doesn't really matter how many GPU's they have or their parent company has. The real disruptive part is releasing the source and weights for their models.