Tag: reinforcement

Ai2's new Olmo 3.1 extends reinforcement learning training for stronger reasoning benchmarks

The Allen Institute for AI (Ai2) recently released what it calls its strongest family of models yet, Olmo…

By saad

Inside Ring-1T: Ant engineers solve reinforcement learning bottlenecks at trillion scale

China’s Ant Group, an affiliate of Alibaba, detailed technical information about its new model, Ring-1T, which the company…

By saad

Person holding popcorn as Alibaba unveils Qwen QwQ-32B — a 32 billion parameter AI model that demonstrates performance rivalling the much larger DeepSeek-R1. This breakthrough highlights the potential of scaling Reinforcement Learning (RL) on robust foundation models.

Alibaba Qwen QwQ-32B: Scaled reinforcement learning showcase

The Qwen team at Alibaba has unveiled QwQ-32B, a 32 billion parameter AI model that demonstrates performance rivalling…

By saad

Open-source DeepSeek-R1 uses pure reinforcement learning to match OpenAI o1 — at 95% less cost

By saad