AI & Compute

DeepCoder delivers top coding performance in efficient 14B open model

Last updated: April 11, 2025 3:07 am
Published April 11, 2025



Researchers at Together AI and Agentica have released DeepCoder-14B, a new coding model that delivers impressive performance comparable to leading proprietary models like OpenAI's o3-mini.

Built on top of DeepSeek-R1, the model offers more flexibility to integrate high-performance code generation and reasoning capabilities into real-world applications. Importantly, the teams have fully open-sourced the model, its training data, code, logs, and system optimizations, which can help researchers improve their work and accelerate progress.

Competitive coding capabilities in a smaller package

The research team's experiments show that DeepCoder-14B performs strongly across several challenging coding benchmarks, including LiveCodeBench (LCB), Codeforces, and HumanEval+.

"Our model demonstrates strong performance across all coding benchmarks… comparable to the performance of o3-mini (low) and o1," the researchers write in a blog post describing the model.

Interestingly, despite being trained primarily on coding tasks, the model shows improved mathematical reasoning, scoring 73.8% on the AIME 2024 benchmark, a 4.1% improvement over its base model (DeepSeek-R1-Distill-Qwen-14B). This suggests that reasoning skills developed through RL on code can generalize effectively to other domains.

Figure: DeepCoder-14B performance. Credit: Together AI

The most striking aspect is that it achieves this level of performance with only 14 billion parameters, which makes DeepCoder significantly smaller and potentially more efficient to run than many frontier models.

Innovations driving DeepCoder's performance

While developing the model, the researchers solved some of the key challenges in training coding models with reinforcement learning (RL).


The first challenge was curating the training data. Reinforcement learning requires reliable reward signals indicating that the model's output is correct. As the researchers point out, "Unlike math, where abundant high-quality, verifiable data is readily available on the Internet, the coding domain suffers from a relative scarcity of such data."

To address this problem, the DeepCoder team implemented a strict pipeline that gathers examples from different datasets and filters them for validity, complexity, and duplication. This process yielded 24,000 high-quality problems, providing a solid foundation for effective RL training.
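The curation logic described above can be pictured as a chain of filters. This is only an illustrative sketch; the function names, thresholds, and the `Problem` structure are assumptions, not the team's actual pipeline code.

```python
# Hypothetical sketch of a curation pipeline: gather raw problems, keep only
# those that pass validity and complexity filters, then deduplicate.
from dataclasses import dataclass, field

@dataclass
class Problem:
    statement: str
    tests: list = field(default_factory=list)

def is_valid(p: Problem) -> bool:
    # e.g. require a non-empty statement and enough unit tests to verify solutions
    return bool(p.statement.strip()) and len(p.tests) >= 5

def is_complex_enough(p: Problem) -> bool:
    # crude stand-in for a difficulty filter: drop trivial one-liners
    return len(p.statement.split()) >= 20

def deduplicate(problems: list) -> list:
    seen, unique = set(), []
    for p in problems:
        key = " ".join(p.statement.lower().split())  # normalized statement text
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique

def curate(raw: list) -> list:
    kept = [p for p in raw if is_valid(p) and is_complex_enough(p)]
    return deduplicate(kept)
```

The real filters would be far stricter (for example, verifying that reference solutions actually pass the tests), but the shape of the pipeline is the same.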

The team also designed a straightforward reward function that gives a positive signal only if the generated code passes all sampled unit tests for the problem within a specific time limit. Combined with the high-quality training examples, this outcome-focused reward prevents the model from learning tricks such as printing memorized answers for public tests or optimizing for simple edge cases without solving the core problem.
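A minimal sketch of that all-or-nothing reward might look as follows. The "sandbox" here is a plain `exec` with a thread-based timeout purely for illustration; a real system would run untrusted generated code in an isolated process or container, and the helper names are assumptions.

```python
# Outcome-based reward: the candidate program must pass every sampled unit
# test within the time budget, otherwise the reward is zero.
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeout

def _passes(code: str, test: str) -> bool:
    env: dict = {}
    try:
        exec(code, env)   # define the candidate solution
        exec(test, env)   # run one assertion against it
        return True
    except Exception:
        return False

def run_test(code: str, test: str, timeout: float = 2.0) -> bool:
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(_passes, code, test)
        try:
            return future.result(timeout=timeout)
        except FuturesTimeout:
            return False  # exceeding the time limit counts as a failure

def reward(code: str, sampled_tests: list) -> float:
    # sparse reward: no partial credit for passing only a subset of tests
    return 1.0 if all(run_test(code, t) for t in sampled_tests) else 0.0
```

The key design point is the absence of partial credit: passing nine of ten tests earns the same zero as passing none, which removes the incentive to game individual test cases.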

The model's core training algorithm is based on Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm that proved very successful in DeepSeek-R1. However, the team made several modifications to the algorithm to make it more stable and to let the model keep improving as training runs for longer.
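The central idea of GRPO is that it needs no learned value network: for each prompt it samples a group of responses and scores each one against the group's own mean and spread. A minimal sketch of that advantage computation, under the usual formulation:

```python
# Group-relative advantage: normalize each response's reward by the mean and
# standard deviation of its own sampled group.
import statistics

def grpo_advantages(group_rewards: list, eps: float = 1e-6) -> list:
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    # responses better than the group average receive a positive advantage
    return [(r - mean) / (std + eps) for r in group_rewards]
```

With the binary pass/fail reward above, a group of sampled solutions where half pass yields advantages near +1 for the passing responses and -1 for the failing ones, pushing the policy toward the successful strategies.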

Figure: GRPO+ allows DeepCoder-14B to keep training for longer without collapsing. Credit: Together AI

Finally, the team extended the model's context window iteratively, first training it on shorter reasoning sequences and gradually increasing the length. They also developed a filtering method to avoid penalizing the model when it produced reasoning chains that exceeded the context limit while solving a hard prompt.

Figure: DeepCoder was trained on 32K-context problems but could also solve 64K-token tasks. Credit: Together AI

The researchers explain the core idea: "To preserve long-context reasoning while enabling efficient training, we incorporated overlong filtering… This technique masks out truncated sequences during training so that models aren't penalized for generating thoughtful but lengthy outputs that exceed the current context limit."

Training progressively scaled the context window from 16K to 32K tokens, and the resulting model could also solve problems requiring up to 64K tokens.
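Mechanically, overlong filtering amounts to zeroing out the loss contribution of any sequence that hit the context limit. A toy sketch, assuming a per-sequence loss and length bookkeeping (the function names are illustrative, not verl's API):

```python
# Overlong filtering: sequences truncated at the context limit are masked out
# of the loss average rather than being penalized as failures.
def masked_mean_loss(per_seq_loss: list, response_lengths: list,
                     context_limit: int) -> float:
    # mask = 1.0 keeps the sequence in the loss; 0.0 removes a truncated one
    mask = [0.0 if n >= context_limit else 1.0 for n in response_lengths]
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(l * m for l, m in zip(per_seq_loss, mask)) / kept
```

The effect is that a long but thoughtful chain of reasoning that simply ran out of room neither rewards nor punishes the policy, so the model is not trained to avoid long reasoning per se.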

Optimizing long-context RL training

Training large models with RL, especially on tasks requiring long generated sequences such as coding or complex reasoning, is computationally intensive and slow. A major bottleneck is the sampling step, where the model generates potentially thousands of tokens per example in the batch. Variations in response length mean some responses finish much later than others, leaving GPUs idle and slowing down the entire training loop.

To accelerate this, the team developed verl-pipeline, an optimized extension of the open-source verl library for reinforcement learning from human feedback (RLHF). The key innovation, which they call "One-Off Pipelining," rearranges response sampling and model updates to reduce bottlenecks and accelerator idle time.

Figure: One-Off Pipelining

Their experiments showed that one-off pipelining delivered up to a 2x speedup for coding RL tasks compared with baseline implementations. This optimization was crucial for training DeepCoder within a reasonable timeframe (2.5 weeks on 32 H100s) and is now open-sourced as part of verl-pipeline for the community to use and build upon.
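The pipelining idea can be illustrated with a toy producer/consumer loop: instead of strictly alternating sample-then-train, the sampler generates the next batch while the trainer consumes the current one, so neither stage sits fully idle. This is a conceptual sketch only; the stage bodies, names, and single-batch lookahead are assumptions, not verl-pipeline's implementation.

```python
# Toy illustration of overlapping sampling with training: a producer thread
# stays one batch ahead of the training loop via a bounded queue.
import queue
import threading

def run_pipelined(num_batches: int, sample, train) -> list:
    batches = queue.Queue(maxsize=1)   # holds the one batch sampled ahead

    def producer():
        for i in range(num_batches):
            batches.put(sample(i))     # overlaps with training of batch i-1
        batches.put(None)              # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (batch := batches.get()) is not None:
        results.append(train(batch))
    return results
```

In the real system the "producer" is GPU inference over thousands of tokens per response, which is exactly the stage whose idle time the one-off pipeline hides behind the model-update step.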


Enterprise impact

The researchers have made all of the artifacts for training and running DeepCoder-14B available on GitHub and Hugging Face under a permissive license.

"By fully sharing our dataset, code, and training recipe, we empower the community to reproduce our work and make RL training accessible to all," the researchers write.

DeepCoder-14B powerfully illustrates a broader, accelerating trend in the AI landscape: the rise of highly capable yet efficient and openly accessible models.

For the enterprise world, this shift means more options and greater accessibility for advanced models. Cutting-edge performance is no longer the sole domain of hyperscalers or those willing to pay premium API fees. Models like DeepCoder can empower organizations of all sizes to leverage sophisticated code generation and reasoning, customize solutions to their specific needs, and deploy them securely within their own environments.

This trend can lower the barrier to entry for AI adoption and foster a more competitive and innovative ecosystem, where progress is driven by open-source collaboration.

