Goodbye DeepSeek? U.S. Researchers Unveil Breakthrough $50 AI Reasoning Model



A team of researchers from Stanford University and the University of Washington has developed an advanced AI reasoning model, named s1, for under $50. The figure stands out in an industry where developing similar models often requires millions of dollars in resources.

The s1 model is designed to tackle complex reasoning tasks and has performed on par with leading models such as OpenAI’s o1 and DeepSeek’s R1 on tasks involving math and coding.

This breakthrough comes amid growing competition in the AI reasoning field. Chinese startup DeepSeek recently made waves with its own reasoning model, R1, which the company claims was developed for just $6 million. However, some critics have questioned the accuracy of DeepSeek’s claims and raised concerns about the safety and security of its models.

The low cost of s1’s development can be attributed to a technique called “distillation.” This process involves training s1 to mimic the reasoning abilities of an existing AI model, specifically Google’s Gemini 2.0 Flash Thinking Experimental model. Using a curated dataset of 1,000 questions and answers, paired with reasoning traces from the Gemini model, s1 learned to arrive at accurate solutions at a fraction of the time and cost of traditional methods.
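To make the idea concrete, here is a minimal sketch of how one distilled record (a question, the teacher model's reasoning trace, and the answer) might be turned into a single piece of training text. The class name, field names, and the "Question / Reasoning / Answer" layout are illustrative assumptions, not the format the s1 team used, and the sample record is invented.

```python
from dataclasses import dataclass


@dataclass
class DistilledExample:
    # Hypothetical record layout; the field names are assumptions.
    question: str
    reasoning_trace: str  # reasoning text produced by the teacher model
    answer: str


def to_training_text(ex: DistilledExample) -> str:
    """Join question, teacher reasoning, and answer into one training string."""
    return (
        f"Question: {ex.question}\n"
        f"Reasoning: {ex.reasoning_trace}\n"
        f"Answer: {ex.answer}"
    )


# A single made-up record standing in for the 1,000 curated ones.
examples = [
    DistilledExample(
        question="What is 12 * 12?",
        reasoning_trace="12 * 12 = 12 * 10 + 12 * 2 = 120 + 24.",
        answer="144",
    )
]

corpus = [to_training_text(ex) for ex in examples]
print(corpus[0])
```

The student model is then trained to reproduce text like this, so it picks up the teacher's step-by-step style along with the final answers.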

According to the researchers, training s1 took just 26 minutes using 16 Nvidia H100 GPUs, costing a mere $20 in total. The researchers used Supervised Fine-Tuning (SFT), a method in which the model is trained directly on examples of questions paired with the desired responses, which keeps the learning process short.
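As a rough sanity check on those figures, the short calculation below converts the reported run into GPU-hours and the hourly rate that a $20 total would imply. The three inputs come from the article; the per-GPU-hour rate is only derived from them and is not a price quoted by the researchers.

```python
# Figures reported in the article.
gpus = 16
minutes = 26
total_cost_usd = 20.0

# Total compute used, in GPU-hours.
gpu_hours = gpus * minutes / 60

# Hourly rate per GPU implied by the reported total (derived, not quoted).
implied_rate = total_cost_usd / gpu_hours

print(f"{gpu_hours:.2f} GPU-hours")          # 6.93 GPU-hours
print(f"${implied_rate:.2f} per GPU-hour")   # $2.88 per GPU-hour
```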

An intriguing development during s1’s creation was the introduction of a “wait” instruction, which helped improve its accuracy. By incorporating pauses into the model’s reasoning process, the researchers found that s1 was able to double-check its responses, often correcting errors and leading to more accurate conclusions.
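One plausible reading of the “wait” instruction is sketched below: after the model produces an answer, the text “Wait,” is appended and the model is asked to continue, giving it a chance to re-examine its work. This is an assumption about the mechanism, not the researchers' code. The function names are hypothetical, and `toy_step` is a hand-written stand-in for a real model that simply gets the answer wrong the first time and corrects itself after “Wait,”.

```python
from typing import Callable


def generate_with_wait(step: Callable[[str], str], prompt: str, extra_rounds: int = 1) -> str:
    """Run `step` once, then append "Wait," and run it again `extra_rounds` times."""
    text = prompt + step(prompt)
    for _ in range(extra_rounds):
        text += "\nWait,"      # nudge the model to keep reasoning
        text += step(text)     # continue from everything written so far
    return text


def toy_step(context: str) -> str:
    # Hypothetical stand-in for a model call: wrong first, corrected after "Wait,".
    if context.endswith("Wait,"):
        return " 7 * 8 is 56, not 54. Final answer: 56"
    return " 7 * 8 = 54. Final answer: 54"


result = generate_with_wait(toy_step, "What is 7 * 8?")
print(result)
```

A real model will not always correct itself on the second pass; the toy function is scripted to do so only to show the control flow.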

Ultimately, the researchers behind s1 hope their work will drive open innovation, making powerful reasoning models more accessible to the global community and accelerating advancements in AI technology for the benefit of society.

The team’s research can be found on arXiv.