DeepSeek R1 is an open-source model. DeepSeek is a Chinese artificial intelligence research firm backed by High-Flyer Capital Management, a quantitative hedge fund focused on applying artificial intelligence to trading decisions. The company releases its models under permissive open-source licenses such as MIT.
How they equaled and even surpassed OpenAI’s o1:
Emphasis on reinforcement learning: DeepSeek-R1 and its variant DeepSeek-R1-Zero were developed with a reinforcement learning (RL) approach, a departure from traditional pipelines that lean heavily on supervised fine-tuning. This let the model develop its reasoning capabilities autonomously, without initially relying on human-annotated datasets, and it proved effective: the model attains high performance on reasoning tasks.
Benchmark performance: DeepSeek-R1-Lite-Preview matched or exceeded OpenAI’s o1 on several benchmarks such as AIME and MATH, which test mathematical reasoning and problem solving. These results are attributed to DeepSeek’s use of chain-of-thought reasoning, in which the model explicitly writes out its reasoning process; this not only provides transparency but also refines the model’s approach to complex problems.
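That chain of thought is visible in practice: R1-style models emit their reasoning inside `<think>…</think>` tags before the final answer. A minimal sketch of splitting the trace from the answer — the sample text and the `split_reasoning` helper are invented for illustration, not part of any DeepSeek API:

```python
import re

def split_reasoning(text):
    """Separate a DeepSeek-R1-style <think>...</think> trace from the final answer."""
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        # No visible trace: treat the whole output as the answer.
        return "", text.strip()
    reasoning = m.group(1).strip()
    answer = text[m.end():].strip()
    return reasoning, answer

# Invented sample output for illustration only.
sample = "<think>2 + 2: add the units digits, 2 and 2, giving 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Keeping the trace separate is useful for logging or grading the reasoning while showing users only the answer.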
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI’s o1 on math, coding, and reasoning tasks!
Try it! 👇
7B distilled:
ollama run deepseek-r1:7b
More distilled sizes are available. 🧵 pic.twitter.com/FdF1U3qvev
— ollama (@ollama) January 20, 2025
Reinforcement learning works well for tasks requiring sequential decision-making, where the AI must learn to take a series of actions to achieve a goal. For DeepSeek-R1, the goal is to generate consistent, contextually appropriate responses in conversational AI and other interactive applications. Reinforcement learning lets DeepSeek-R1 optimize for long-term outcomes rather than only immediate rewards, which is crucial for maintaining context and consistency over longer interactions.
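The idea of optimizing a sparse, outcome-level reward over a sequence of decisions can be sketched with a minimal REINFORCE-style policy-gradient loop. The toy three-token "sequence task" below is an illustrative assumption, not DeepSeek's actual training setup (R1's RL is reported to use a more sophisticated group-relative scheme over verifiable rewards); it shows the core mechanic — reward arrives only at the end, yet every step's policy improves:

```python
import math
import random

TARGET = [1, 0, 1]          # the sequence that earns reward (toy example)
theta = [0.0, 0.0, 0.0]     # per-step logit for choosing token 1

def sample_episode():
    """Sample a 3-token sequence from the current stochastic policy."""
    tokens = []
    for t in range(3):
        p1 = 1.0 / (1.0 + math.exp(-theta[t]))  # sigmoid(theta[t])
        tokens.append(1 if random.random() < p1 else 0)
    return tokens

def train(steps=2000, lr=0.5):
    for _ in range(steps):
        tokens = sample_episode()
        # Outcome-only reward: 1 if the whole sequence is right, else 0.
        reward = 1.0 if tokens == TARGET else 0.0
        # REINFORCE update: theta += lr * reward * grad(log pi(token))
        for t in range(3):
            p1 = 1.0 / (1.0 + math.exp(-theta[t]))
            grad_logp = (1.0 - p1) if tokens[t] == 1 else -p1
            theta[t] += lr * reward * grad_logp

random.seed(0)
train()
greedy = [1 if th > 0 else 0 for th in theta]
print(greedy)  # the policy's greedy sequence after training
```

Because the reward is granted only for fully correct sequences, every update pushes each step's logit toward the target token, so the greedy policy converges to the rewarded sequence despite never receiving per-step feedback.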
The DeepSeek R1 model has a 671-billion-parameter architecture and was trained on top of the DeepSeek V3 Base model. It focuses on chain-of-thought (CoT) reasoning to compete on advanced understanding and reasoning tasks. Like DeepSeek V3, it uses a mixture-of-experts design in which only about 37 billion parameters are activated for any given token.
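The "37B active out of 671B" figure comes from mixture-of-experts (MoE) routing: a small router picks a few experts per token, and the unselected experts contribute no compute at all. A minimal top-k routing sketch — the expert count, router logits, and scalar "experts" here are illustrative assumptions, not DeepSeek's real configuration:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_logits, k=2):
    """Route input x to only the top-k experts; the rest stay inactive."""
    probs = softmax(router_logits)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Weighted sum over the selected experts only -- this sparsity is what
    # lets a huge total parameter count run with a small active subset.
    y = sum(probs[i] / norm * experts[i](x) for i in top)
    return y, top

# Eight toy "experts", each just a scalar multiplier (illustrative only).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
y, active = moe_forward(
    2.0, experts,
    router_logits=[0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.1, 0.4],
    k=2,
)
print(active)  # indices of the 2 experts actually used
```

With k=2 of 8 experts, only a quarter of the expert parameters run per token — the same principle, at toy scale, as activating ~37B of 671B.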
The DeepSeek R1 ecosystem also includes six smaller models distilled from synthetic data generated by DeepSeek R1 itself. These models vary in size and are designed for specific applications, letting developers use lighter, faster models while retaining strong performance.
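Distillation here means training a small model to imitate the large one's outputs. R1's distilled models are reported to be fine-tuned on teacher-generated samples, but the principle can be shown with a classic soft-label sketch: the student's logits are pushed toward the teacher's output distribution by gradient descent on cross-entropy (the four-token vocabulary and logit values are invented for illustration):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Teacher's output distribution over a toy 4-token vocabulary.
teacher_logits = [2.0, 0.5, -1.0, 0.0]
teacher_p = softmax(teacher_logits)

# Student starts uninformed; gradient of cross-entropy w.r.t. logits
# under softmax is simply (p_student - p_teacher).
student_logits = [0.0, 0.0, 0.0, 0.0]
for _ in range(500):
    p = softmax(student_logits)
    for i in range(4):
        student_logits[i] -= 0.5 * (p[i] - teacher_p[i])

student_p = softmax(student_logits)
print(student_p)  # now close to the teacher's distribution
```

After training, the student reproduces the teacher's preferences without ever seeing the original training data — only the teacher's outputs.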
DeepSeek R1 can be downloaded from GitHub.
DeepSeek: 50 times lower cost
DeepSeek has achieved these results with significantly fewer computational resources than are usually required to train models with similar capabilities. DeepSeek offers competitive performance at roughly 2% of the cost, both for training and for inference.


Brian Wang is a futurist thought leader and popular science blogger with one million monthly readers. His blog Nextbigfuture.com is ranked number one in the Science News Blog category. It covers many disruptive technologies and trends, including space, robotics, artificial intelligence, medicine, anti-aging biotechnology, and nanotechnology.
Known for identifying cutting-edge technologies, he is currently a co-founder of a startup and a fundraiser for high-potential, early-stage companies. He is Head of Research for allocations in deep-technology investments and an angel investor at Space Angels.
A frequent corporate speaker, he has been a TEDx speaker, a Singularity University speaker, and a guest on numerous radio shows and podcasts. He is open to public speaking and advising engagements.