Alibaba Cloud’s Qwen team has unveiled Qwen2-Math, a series of large language models specifically designed to tackle complex mathematical problems.
These new models, built upon the existing Qwen2 foundation, demonstrate remarkable proficiency in solving arithmetic and mathematical challenges, and outperform former industry leaders.
The Qwen team built Qwen2-Math using a vast and diverse Mathematics-specific Corpus. This corpus draws on high-quality sources, including web texts, books, code, exam questions, and synthetic data generated by Qwen2 itself.
Rigorous evaluation on both English and Chinese mathematical benchmarks, including GSM8K, MATH, MMLU-STEM, CMATH, and GaoKao Math, revealed the exceptional capabilities of Qwen2-Math. Notably, the flagship model, Qwen2-Math-72B-Instruct, surpassed the performance of proprietary models such as GPT-4o and Claude 3.5 in various mathematical tasks.
“Qwen2-Math-Instruct achieves the best performance among models of the same size, with RM@8 outperforming Maj@8, particularly in the 1.5B and 7B models,” the Qwen team noted.
This superior performance is attributed to the effective implementation of a math-specific reward model during the development process.
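The two metrics in that quote describe different ways of picking one final answer from eight sampled responses: Maj@8 takes the most frequent answer (majority voting), while RM@8 takes the answer from the response the reward model scores highest. The sketch below illustrates the difference; the sampled answers and reward scores are made up for illustration and do not come from Qwen2-Math or its reward model.

```python
from collections import Counter


def maj_at_n(answers):
    """Majority voting: return the most frequent final answer among the samples."""
    return Counter(answers).most_common(1)[0][0]


def rm_at_n(answers, scores):
    """Reward-model selection: return the answer of the highest-scoring sample."""
    best_index = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_index]


# Hypothetical data: final answers extracted from 8 sampled responses,
# plus a made-up reward-model score for each response.
sampled_answers = ["12", "15", "12", "15", "12", "15", "12", "14"]
reward_scores = [0.21, 0.93, 0.18, 0.88, 0.25, 0.90, 0.30, 0.40]

maj_choice = maj_at_n(sampled_answers)
rm_choice = rm_at_n(sampled_answers, reward_scores)
print("Maj@8 picks:", maj_choice)
print("RM@8 picks:", rm_choice)
```

In this toy case the two strategies disagree: voting follows the most common answer, while the reward model can favour a less frequent answer if it rates that response more highly.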
Further showcasing its prowess, Qwen2-Math demonstrated impressive results in challenging mathematical competitions such as the American Invitational Mathematics Examination (AIME) 2024 and the American Mathematics Competitions (AMC) 2023.
To ensure the model’s integrity and prevent contamination, the Qwen team implemented robust decontamination methods during both the pre-training and post-training phases. This rigorous approach involved removing duplicate samples and identifying overlaps with test sets to maintain the model’s accuracy and reliability.
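As a rough illustration of those two steps, the sketch below drops duplicate training samples and then filters out any sample that shares a word n-gram with a test item. It is a minimal example under stated assumptions, not the Qwen team’s pipeline: the function names, the normalization, and the n-gram length of 5 are all hypothetical choices made for this toy data.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def ngrams(text, n):
    """Return the set of word n-grams in the normalized text."""
    words = normalize(text).split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def decontaminate(train_samples, test_samples, n=5):
    """Drop duplicate samples and samples sharing an n-gram with the test set.

    The value n=5 is an illustrative assumption, not a documented setting.
    Samples shorter than n words have no n-grams and are never flagged.
    """
    test_ngrams = set()
    for sample in test_samples:
        test_ngrams |= ngrams(sample, n)

    seen = set()
    kept = []
    for sample in train_samples:
        key = normalize(sample)
        if key in seen:
            continue  # duplicate of an earlier sample
        seen.add(key)
        if ngrams(sample, n) & test_ngrams:
            continue  # overlaps with a test item
        kept.append(sample)
    return kept


# Hypothetical training and test data.
train = [
    "Compute the sum of the first ten positive integers.",
    "compute the sum of the  first ten positive integers.",
    "What is the area of a circle with radius 3?",
    "Find all real x such that x squared equals 16.",
]
test = ["Hint: what is the area of a circle with radius 3?"]

clean = decontaminate(train, test)
print(clean)
```

Here the second sample is removed as a duplicate of the first, and the third is removed because it overlaps with the test item, leaving two training samples.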
Looking ahead, the Qwen team plans to expand Qwen2-Math’s capabilities beyond English, with bilingual and multilingual models in the pipeline. This commitment to inclusivity aims to make advanced mathematical problem-solving accessible to a global audience.
“We will continue to enhance our models’ ability to solve complex and challenging mathematical problems,” affirmed the Qwen team.
The Qwen2 models are available on Hugging Face.