Has OpenAI finally met its match?

A Chinese AI lab has released a new AI model capable of reasoning that matches OpenAI’s o1

Martin Crowley
November 21, 2024

Chinese AI lab DeepSeek has revealed a preview of a new AI model, DeepSeek-R1, that is reportedly capable of human-like reasoning on par with OpenAI’s own reasoning model, o1.

Both models can spend tens of seconds ‘thinking’ before they respond: planning, evaluating, fact-checking, and executing multiple actions, much as a human would. This helps them deliver more accurate and relevant responses and reduces common AI model pitfalls such as inaccurate or nonsensical answers.

On two popular benchmarks, DeepSeek-R1 is reportedly on par with o1: AIME (problems drawn from the American Invitational Mathematics Examination, a competition-level math exam) and MATH (a collection of math problems that tests mathematical reasoning). Both models, however, still struggle with some basic logic problems, such as tic-tac-toe.

Although it appears to match o1 on performance, DeepSeek-R1 does have two weaknesses:

  1. Users testing the model found they could bypass some of its safeguards and, for example, extract a detailed recipe for methamphetamine.

  2. Users also found that it avoided answering politically sensitive questions, possibly because of Chinese government pressure to build AI models that “embody core socialist values.”

The release of reasoning models like DeepSeek-R1 and o1 marks a shift in AI model development. Reports suggest that the rate of improvement is slowing, largely because of a lack of untapped real-world data. As a result, tech companies like OpenAI and DeepSeek are turning to new methods such as test-time compute, which gives an AI model more processing time while it is answering a query, rather than during training, to help it complete tasks more effectively.
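
To make the idea of test-time compute concrete, here is a minimal Python sketch of one simple form of it: sampling several answers and taking a majority vote. This is an illustration only and is not how o1 or DeepSeek-R1 work internally, since neither lab's method is described here. The `noisy_solver` function is a hypothetical stand-in for a model that answers correctly 60% of the time, and every name and number in the sketch is an assumption.

```python
import random
from collections import Counter

CORRECT = 42  # assumed "right answer" for this toy problem


def noisy_solver(rng, accuracy=0.6):
    """Hypothetical stand-in for one sampled model answer.

    Returns the correct answer with probability `accuracy`,
    otherwise one of three wrong answers chosen at random.
    """
    if rng.random() < accuracy:
        return CORRECT
    return rng.choice([41, 43, 44])


def answer_with_budget(rng, n_samples):
    """Spend more compute per query: sample n answers, return the most common."""
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def accuracy_at_budget(n_samples, trials=2000, seed=0):
    """Estimate how often the voted answer is correct for a given sample budget."""
    rng = random.Random(seed)
    hits = sum(answer_with_budget(rng, n_samples) == CORRECT for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 25):
        print(f"samples per query: {n:2d}  accuracy: {accuracy_at_budget(n):.2f}")
```

In this toy setup the same underlying solver becomes more reliable as the per-query budget grows, which is the basic trade that test-time compute makes: more processing at answer time in exchange for better results.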