Sunday, January 26, 2025

China’s New AI Model DeepSeek Just Won the Tech Race. DeepSeek-R1 performs reasoning tasks at the same level as OpenAI’s o1 — and is open for researchers to examine.

It is not an exaggeration to suggest that innovation is part of Chinese DNA. A Chinese-built large language model called DeepSeek-R1 is thrilling scientists as an affordable and open rival to ‘reasoning’ models such as OpenAI’s o1.

I noted that these models generate responses step by step, in a process analogous to human reasoning. This makes them more adept than earlier language models at solving scientific problems and could make them useful in research projects. Initial tests of R1, released on 20 January, show that its performance on certain tasks in chemistry, mathematics, and coding is on par with that of o1 — which wowed researchers when OpenAI released it in September.

“This is wild and totally unexpected,” Elvis Saravia, an AI researcher and co-founder of the UK-based AI consulting firm DAIR.AI, wrote on the social platform X.

R1 stands out for another reason. DeepSeek, the start-up in Hangzhou that built the model, has released it as ‘open-weight’, meaning that researchers can study and build on the algorithm. Published under an MIT licence, the model can be freely reused but is not considered fully open source, because its training data has not been made available.

“The openness of DeepSeek is quite remarkable,” says Mario Krenn, leader of the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany. By comparison, o1 and other models built by OpenAI in San Francisco, California, including its latest effort, o3, are “essentially black boxes”, he says.
