TechPolyp

Alibaba Releases New AI Model, “QwQ-32B”, Claims It Outperforms OpenAI And DeepSeek

Image: QwQ-32B Preview. Credit: Digital Watch Observatory

Alibaba says its new open-source model performed better than DeepSeek’s R1 in areas like mathematics, coding, and general problem-solving.

Alibaba, the Chinese software and e-commerce giant, introduced a new artificial intelligence (AI) model on Thursday, March 6. The company says the model can handle complex problems just as well as DeepSeek’s and is more efficient because it needs far less data. It also claimed that its new compact reasoning model, QwQ-32B, is “comparable” to other more advanced, larger models, such as OpenAI’s o1-mini.

“Today, we release QwQ-32B, our new reasoning model with only 32 billion parameters, delivering performance comparable to other larger cutting edge models,” the Alibaba Group, which was founded by billionaire Jack Ma, stated in an X post.

Alibaba’s release comes after DeepSeek, a Chinese startup, launched a low-cost AI model in January. DeepSeek’s model competed favorably with OpenAI’s flagship product, ChatGPT.

Alibaba’s new AI model builds on its earlier model, Qwen 2.5. Its most recent AI language model can process text, images, and audio. It can also analyze complex data, spot patterns, and produce answers in a human-like style. The company claims that QwQ-32B, an open-source generative AI model, performed better than DeepSeek’s R1 in domains like mathematics, coding, and general problem-solving.

QwQ-32B Developed With Reinforcement Learning

Machine learning approaches broadly fall into three categories: supervised learning, unsupervised learning, and reinforcement learning. The company’s new AI model was developed with reinforcement learning (RL), a machine learning technique in which AI systems learn through trial and error.
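To make the idea of learning by trial and error concrete, here is a minimal sketch in Python. It is a toy illustration only and not Alibaba’s training method: the function name `run_bandit`, the reward values, and every parameter are assumptions made up for this example. An agent repeatedly picks one of three actions, receives a noisy reward, and gradually favors the action that has paid off best.

```python
import random


def run_bandit(true_rewards, steps=2000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy agent (hypothetical example, not QwQ-32B's training).

    true_rewards: average payoff of each action, unknown to the agent.
    Returns the agent's learned estimates and how often it tried each action.
    """
    rng = random.Random(seed)
    n_actions = len(true_rewards)
    estimates = [0.0] * n_actions  # running average reward seen per action
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Trial: occasionally try a random action to gather information.
            action = rng.randrange(n_actions)
        else:
            # Otherwise pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment answers with a noisy reward (assumed Gaussian noise).
        reward = rng.gauss(true_rewards[action], 0.1)

        # Error correction: move the running average toward the new reward.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.1, 0.5, 0.9])
    print("learned estimates:", [round(e, 2) for e in estimates])
    print("times each action was tried:", counts)
```

Training a large language model with RL is far more involved, but the loop is the same in spirit: try something, observe feedback, and adjust future behavior accordingly.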

“Our research explores the scalability of Reinforcement Learning (RL) and its impact on enhancing the intelligence of large language models,” the team developing Alibaba’s AI models stated.

“Furthermore, we have integrated agent-related capabilities into the reasoning model, enabling it to think critically while utilizing tools and adapting its reasoning based on environmental feedback,” the team said.

“These advancements not only demonstrate the transformative potential of RL but also pave the way for further innovations in the pursuit of artificial general intelligence,” they added.

The company is optimistic that it can eventually achieve Artificial General Intelligence (AGI) by “combining stronger foundation models with RL, powered by scaled computational resources.” The team explained: “Additionally, we are actively exploring the integration of agents with RL to enable long-horizon reasoning, aiming to unlock greater intelligence with inference time scaling.”
