o3-mini is here, with the momentum of a challenger
On January 31, OpenAI released its new o3-mini model and made part of its functionality available for free to all ChatGPT users. The number of queries is limited, but anyone can now try OpenAI’s latest commercial model right away.
Just a few days earlier, DeepSeek, a large-model company from China, released its latest open source model, DeepSeek-R1, which quickly made its own mark on the AI community.
DeepSeek-R1 can match OpenAI’s o1 model while costing less. More importantly, DeepSeek-R1 is open source, which is its biggest difference from OpenAI’s models.
The question is: is o3-mini really better than DeepSeek-R1?
OpenAI’s official comparison only covers OpenAI’s own models and does not compare o3-mini directly with DeepSeek-R1. However, newly released benchmark data shows that o3-mini is slightly better in many areas, and the scores for the individual tests make this clear.
Let’s let the data speak for itself and take a closer look at the real strength of these two models. Keep in mind that benchmark numbers are only part of the picture; what users actually experience in day-to-day use often matters more.
Data comparison: o3-mini is smarter, but DeepSeek-R1 is more “mathematical”
Overall average score
OpenAI o3-mini: 73.94
DeepSeek-R1: 71.38
o3-mini’s overall score is slightly higher, which suggests it performs more consistently across mixed tasks. The gap to DeepSeek’s open source model, however, is small.
Reasoning ability (AI’s ability to understand, analyze, and reason about information)
OpenAI o3-mini: 89.58
DeepSeek-R1: 83.17
In reasoning tasks, o3-mini clearly wins, which means it is better at extracting key content from complex information and making logical inferences.
Programming ability (AI’s ability to process code)
OpenAI o3-mini: 82.74
DeepSeek-R1: 66.74
If you are a developer, o3-mini may be the better choice. The 16-point gap is the largest in o3-mini’s favor in this comparison, which suggests it understands and solves programming problems considerably better than DeepSeek-R1.
Mathematical ability (calculation, formula derivation, mathematical reasoning)
OpenAI o3-mini: 65.65
DeepSeek-R1: 79.54
DeepSeek-R1 is stronger at mathematical tasks, indicating that it is better at numerical calculations and mathematical reasoning.
Data analysis skills (ability to process and understand data)
OpenAI o3-mini: 70.64
DeepSeek-R1: 69.78
o3-mini has a slight lead in data analysis tasks.
Language comprehension skills
OpenAI o3-mini: 50.68
DeepSeek-R1: 48.53
Although the advantage is not great, o3-mini still slightly outperforms in language tasks.
NYT Connections (puzzle)
o3-mini: 72.4 points (excellent performance)
DeepSeek-R1: 54.4 points
Humanity’s Last Exam (complex task)
o3-mini: 13.0% accuracy
DeepSeek-R1: 9.4% accuracy
Codeforces (competitive programming)
o3-mini > DeepSeek-R1
AIME 2024 (competition mathematics exam)
o3-mini > DeepSeek-R1
In summary, o3-mini is stronger in reasoning, programming, and language, while DeepSeek-R1 leads on the mathematical ability score above.
API price comparison: who is more cost-effective?
DeepSeek-R1’s API is cheaper, while o3-mini is still relatively expensive. That makes DeepSeek-R1 the better fit for developers on a budget.
Open source vs. closed source: OpenAI is still closed
If you are concerned about open source, DeepSeek-R1 is a better choice. It is completely open source, while o3-mini still follows the tradition of OpenAI and remains closed. This may affect the freedom of developers in terms of model optimization and customization.
Final conclusion: which one should you choose?
Dimension | o3-mini (OpenAI) | DeepSeek-R1 |
Overall score | 73.94 | 71.38 |
Reasoning | 89.58 (stronger) | 83.17 |
Programming | 82.74 (stronger) | 66.74 |
Mathematics | 65.65 | 79.54 |
Data analysis | 70.64 | 69.78 |
Language understanding | 50.68 | 48.53 |
API price | More expensive | Cheaper |
Open source | Closed source | Fully open source |
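To make the size of each gap easier to see, here is a minimal Python sketch that subtracts DeepSeek-R1's score from o3-mini's for every numeric row of the table. The scores are copied from this article; the names SCORES and score_gaps are made up for illustration and do not come from either vendor.

```python
# Scores copied from the comparison table: (o3-mini, DeepSeek-R1).
SCORES = {
    "Overall score": (73.94, 71.38),
    "Reasoning": (89.58, 83.17),
    "Programming": (82.74, 66.74),
    "Mathematics": (65.65, 79.54),
    "Data analysis": (70.64, 69.78),
    "Language understanding": (50.68, 48.53),
}


def score_gaps(scores):
    """Return {dimension: o3-mini score minus DeepSeek-R1 score}, rounded to 2 places."""
    return {name: round(o3 - r1, 2) for name, (o3, r1) in scores.items()}


if __name__ == "__main__":
    # Print the dimensions from o3-mini's biggest lead to DeepSeek-R1's biggest lead.
    ranked = sorted(score_gaps(SCORES).items(), key=lambda kv: kv[1], reverse=True)
    for name, gap in ranked:
        leader = "o3-mini" if gap > 0 else "DeepSeek-R1"
        print(f"{name:<24} {gap:+7.2f}  ({leader} ahead)")
```

A positive gap means o3-mini is ahead and a negative one means DeepSeek-R1 is ahead. Programming and mathematics are the only two dimensions where the gap exceeds ten points, and they point in opposite directions.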
Who is it for?
- If you are a developer or engineer and need strong programming and reasoning capabilities, o3-mini is the better choice. We believe OpenAI’s o3-mini performs very well in reasoning, and its stronger programming ability can help you write better code while cutting the time you spend on fixes and review.
- If you are a mathematics researcher or sensitive to API costs, DeepSeek-R1 is the more economical choice. It offers better support for mathematical work at a lower cost of use.
- If you need an open source model, DeepSeek-R1 is the winner. Meta, which also focuses on open source, does not match DeepSeek in some capabilities, while the comparable OpenAI model is more expensive and is a closed source commercial product. DeepSeek will help push AI research and development forward while letting more companies and individual users deploy large AI models locally or on cloud servers, protecting the security and privacy of their data.
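The recommendations above boil down to weighting the benchmark categories by what matters for your work. The Python sketch below shows one way to do that, using the category scores from this article. The weight profiles (DEVELOPER, MATH_RESEARCHER) and the function names are hypothetical examples, and the sketch deliberately leaves out price and licensing, which this article does not quantify.

```python
# Category scores taken from the comparison in this article.
SCORES = {
    "o3-mini": {
        "reasoning": 89.58, "programming": 82.74, "mathematics": 65.65,
        "data_analysis": 70.64, "language": 50.68,
    },
    "DeepSeek-R1": {
        "reasoning": 83.17, "programming": 66.74, "mathematics": 79.54,
        "data_analysis": 69.78, "language": 48.53,
    },
}


def weighted_score(model_scores, weights):
    """Weighted average of the categories named in `weights`."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(model_scores[cat] * w for cat, w in weights.items()) / total


def pick_model(weights):
    """Return the model name with the highest weighted score."""
    return max(SCORES, key=lambda model: weighted_score(SCORES[model], weights))


# Hypothetical weight profiles; adjust them to your own priorities.
DEVELOPER = {"programming": 0.6, "reasoning": 0.4}
MATH_RESEARCHER = {"mathematics": 0.8, "reasoning": 0.2}

if __name__ == "__main__":
    print("Developer profile:", pick_model(DEVELOPER))
    print("Math researcher profile:", pick_model(MATH_RESEARCHER))
```

With these example weights the developer profile favors o3-mini and the math researcher profile favors DeepSeek-R1, matching the recommendations in the list. A different set of weights can easily flip the result, so treat the output as a starting point rather than a verdict.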
Future outlook: competition for AI models is intensifying
Both OpenAI and DeepSeek are driving the development of AI technology. Although o3-mini is currently slightly better at most tasks, DeepSeek-R1 still has its own unique advantages.
The open source nature of DeepSeek has attracted the attention of many developers and users. The lower price also lays a good foundation for the development of AI applications.
In contrast, OpenAI, a leader in the AI industry, has delivered a great deal of innovation, but its closed commercial model and high usage costs raise the barrier to entry, which does not help the wider adoption of AI.
We think DeepSeek has done great work for the AI industry. Open source gives developers more opportunities to learn how advanced AI models work.
In the future, we may see the emergence of even more powerful models, such as OpenAI’s GPT-5 or DeepSeek-R2. For ordinary users, the best AI is not the “strongest” AI, but the AI that best suits their needs. When choosing an AI model that suits you, you must consider your own application scenarios and budget.