China’s rapidly rising AI startup, DeepSeek, is causing quite a stir in the tech world. The company recently launched its V3 and R1 models, aimed at surpassing leading global players like OpenAI. While DeepSeek boasts about the relatively low cost of training its V3 model, which the company claims only cost $5.6 million, a deeper look at the numbers paints a more complicated picture.
A report by SemiAnalysis reveals that the $5.6 million figure covers only the pre-training cost, excluding research, maintenance, operation, and hardware. DeepSeek also shares its computing resources with its parent company, the High-Flyer hedge fund, which invested over $500 million in Nvidia GPUs alone. The total cost of DeepSeek’s server infrastructure is estimated at an eye-watering $1.6 billion, with an additional $944 million spent on operating the clusters. Beyond this new estimate, which far exceeds the company’s initial statements, OpenAI has recently claimed that DeepSeek illegally used its data to train its own AI model.
So, while DeepSeek’s numbers may look good on paper, the true expenses tell a different story, especially when considering the full scope of development. The company also reportedly hires only from China’s universities, offering salaries of up to $1.3 million per year to top talent. With just 150 employees, DeepSeek operates with a lean but highly effective team. Another edge DeepSeek has over both Western and some Chinese competitors lies in its in-house data centers, which allow for more agile experimentation and quicker iteration because the company does not have to rely on outside providers.
Ultimately, while DeepSeek’s models are impressive, they aren’t necessarily beating other models on the market, especially OpenAI’s latest o3 model. DeepSeek’s rapid development and lower operational costs make for a compelling story, but beneath the surface, the full picture is more complex than the startup’s claims suggest.