
Meta proposes new scalable memory layers that improve knowledge, reduce hallucinations

Last updated: January 8, 2025 1:10 am
Published January 8, 2025


As enterprises continue to adopt large language models (LLMs) in various applications, one of the key challenges they face is improving the factual knowledge of models and reducing hallucinations. In a new paper, researchers at Meta AI propose “scalable memory layers,” which could be one of several possible solutions to this problem.

Scalable memory layers add more parameters to LLMs to increase their learning capacity without requiring additional compute resources. The architecture is useful for applications where you can spare extra memory for factual knowledge but also want the inference speed of nimbler models.

Dense and memory layers

Traditional language models use “dense layers” to encode vast amounts of information in their parameters. In dense layers, all parameters are used at their full capacity and are mostly activated at the same time during inference. Dense layers can learn complex functions, but increasing their size requires additional computational and energy resources.

In contrast, for simple factual knowledge, much simpler layers with associative memory architectures would be more efficient and interpretable. This is what memory layers do. They use simple sparse activations and key-value lookup mechanisms to encode and retrieve knowledge. Sparse layers take up more memory than dense layers but only use a small portion of the parameters at once, which makes them much more compute-efficient.
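The key-value lookup described above can be illustrated with a small sketch. This is not the paper's implementation: the function name `memory_lookup`, the dot-product scoring, and the softmax over the selected entries are assumptions chosen for illustration. Note that this naive version still scores every key; only the read of the value table is sparse.

```python
import math


def memory_lookup(query, keys, values, k=2):
    """Return a weighted mix of the k values whose keys best match the query.

    Hypothetical illustration of a sparse key-value lookup, not Meta's code.
    """
    # Score every key against the query with a dot product.
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    # Sparse step: keep only the indices of the k highest-scoring keys.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only (shifted by the max for stability).
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Mix the selected value vectors; the remaining values are never read.
    out = [0.0] * len(values[0])
    for weight, i in zip(weights, top):
        for d, component in enumerate(values[i]):
            out[d] += weight * component
    return out, top
```

With a table of millions of entries and a small `k`, the output depends on only a handful of value vectors, which is the property that keeps the compute cost low even as the table grows.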

Memory layers have existed for several years but are rarely used in modern deep learning architectures. They are not optimized for current hardware accelerators.


Current frontier LLMs usually use some form of “mixture of experts” (MoE) architecture, which uses a mechanism vaguely similar to memory layers. MoE models are composed of many smaller expert components that specialize in specific tasks. At inference time, a routing mechanism determines which expert becomes activated based on the input sequence. PEER, an architecture recently developed by Google DeepMind, extends MoE to millions of experts, providing more granular control over the parameters that become activated during inference.
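To make the routing idea concrete, here is a minimal sketch of top-1 routing, assuming a linear router that scores each expert with a dot product. The names `route_top1`, `router_weights`, and `experts` are hypothetical, and the toy experts are plain functions rather than neural sub-networks; production MoE models often activate more than one expert per token.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def route_top1(token, router_weights, experts):
    """Send the token to the single expert with the highest router score."""
    scores = [dot(token, w) for w in router_weights]
    best = max(range(len(experts)), key=lambda i: scores[i])
    # Only the chosen expert runs; the others cost no compute for this token.
    return experts[best](token), best


# Three toy experts (assumed for illustration), each a simple function.
experts = [
    lambda v: [2 * x for x in v],
    lambda v: [-x for x in v],
    lambda v: [x + 1 for x in v],
]
# One router weight vector per expert.
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

output, chosen = route_top1([0.2, 0.9], router_weights, experts)
```

The resemblance to a memory layer is that both select a small subset of parameters per input; the difference in this sketch is that an expert is a function to run, whereas a memory slot is a stored vector to read.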

Upgrading memory layers

Memory layers are light on compute but heavy on memory, which presents specific challenges for current hardware and software frameworks. In their paper, the Meta researchers propose several modifications that solve these challenges and make it possible to use them at scale.
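The "light on compute, heavy on memory" trade-off can be put in rough numbers. The sketch below is back-of-the-envelope arithmetic only: the function name `value_table_footprint`, the 2-byte parameter size, and any slot counts passed to it are assumptions, not figures from the paper.

```python
def value_table_footprint(num_slots, dim, k, bytes_per_param=2):
    """Compare what a value table stores with what one lookup reads.

    All inputs are hypothetical; bytes_per_param=2 assumes 16-bit weights.
    """
    stored = num_slots * dim  # parameters held in memory
    read = k * dim            # parameters actually read per lookup
    return {
        "stored_params": stored,
        "stored_bytes": stored * bytes_per_param,
        "params_read_per_lookup": read,
        "fraction_read": k / num_slots,
    }


# Example with made-up sizes: one million slots of width 512, 32 read per lookup.
stats = value_table_footprint(num_slots=1_000_000, dim=512, k=32)
```

Under these assumed sizes the table occupies about a gigabyte, yet each lookup touches well under a hundredth of a percent of it, which is why memory capacity and bandwidth, rather than arithmetic, become the bottleneck.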

Memory layers can store knowledge in parallel across multiple GPUs without slowing down the model (source: arXiv)

First, the researchers configured the memory layers for parallelization, distributing them across several GPUs to store millions of key-value pairs without changing other layers in the model. They also implemented a special CUDA kernel for handling high-memory-bandwidth operations. Finally, they developed a parameter-sharing mechanism that supports a single set of memory parameters across multiple memory layers within a model. This means that the keys and values used for lookups are shared across layers.
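Two of these ideas, splitting one table across devices and sharing it between layers, can be sketched in plain Python. The class names `ShardedMemory` and `MemoryLayer`, the row-wise split, and the helper `count_unique_slots` are all assumptions made for illustration; the shards here are ordinary lists standing in for GPU memory, and the article does not describe the actual partitioning scheme.

```python
class ShardedMemory:
    """One value table split row-wise across several shards ("devices")."""

    def __init__(self, values, num_shards):
        self.num_slots = len(values)
        # Ceiling division so every slot lands in exactly one shard.
        self.shard_size = -(-self.num_slots // num_shards)
        self.shards = [
            values[start:start + self.shard_size]
            for start in range(0, self.num_slots, self.shard_size)
        ]

    def locate(self, slot):
        """Map a global slot index to (shard index, offset within shard)."""
        return divmod(slot, self.shard_size)

    def read(self, slot):
        shard, offset = self.locate(slot)
        return self.shards[shard][offset]


class MemoryLayer:
    """A layer that references a memory table instead of owning a copy."""

    def __init__(self, memory):
        self.memory = memory

    def forward(self, slot):
        return self.memory.read(slot)


def count_unique_slots(layers):
    """Count slots once per distinct table, however many layers use it."""
    unique = {id(layer.memory): layer.memory for layer in layers}
    return sum(memory.num_slots for memory in unique.values())
```

Because every `MemoryLayer` holds a reference to the same `ShardedMemory`, adding more memory layers in this sketch does not add more stored parameters.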

These modifications make it possible to implement memory layers within LLMs without slowing down the model.

“Memory layers with their sparse activations nicely complement dense networks, providing increased capacity for knowledge acquisition while being light on compute,” the researchers write. “They can be efficiently scaled, and provide practitioners with an attractive new direction to trade-off memory with compute.”


To test memory layers, the researchers modified Llama models by replacing one or more dense layers with a shared memory layer. They compared the memory-enhanced models against the dense LLMs as well as MoE and PEER models on several tasks, including factual question answering, scientific and common-sense world knowledge, and coding.

A 1.3B memory model (solid line) trained on 1 trillion tokens approaches the performance of a 7B model (dashed line) on factual question-answering tasks as it is given more memory parameters (source: arXiv)

Their findings show that memory models improve significantly over dense baselines and compete with models that use 2X to 4X more compute. They also match the performance of MoE models that have the same compute budget and parameter count. The model’s performance is especially notable on tasks that require factual knowledge. For example, on factual question-answering, a memory model with 1.3 billion parameters approaches the performance of Llama-2-7B, which has been trained on twice as many tokens and 10X more compute.

Moreover, the researchers found that the benefits of memory models remain consistent with model size as they scaled their experiments from 134 million to 8 billion parameters.

“Given these findings, we strongly advocate that memory layers should be integrated into all next generation AI architectures,” the researchers write, while adding that there is still a lot more room for improvement. “In particular, we hope that new learning methods can be developed to push the effectiveness of these layers even further, enabling less forgetting, fewer hallucinations and continual learning.”
