Global Market

Nvidia targets inference as AI’s next battleground with Groq 3 LPX

Last updated: March 18, 2026 4:56 am
Published March 18, 2026
[Image: Nvidia high-performance chip technology]

It’s a massive cost play, he pointed out, and it “has to happen everywhere, all the time, for all users.”

The next phase of inferencing

The new Groq 3 language processing units (LPUs) are based on intellectual property (IP) from Groq, which signed a $20 billion licensing agreement with Nvidia late last year. According to the chip company, a fleet of LPUs can function as a “large single processor.”

While Rubin GPUs will continue to handle prefill (prompt processing), Groq’s LPX will now handle the latency-sensitive portions of decode (response generation). Together, they can deliver a “new class of inference performance,” Nvidia says.
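The prefill/decode split described above can be sketched in a few lines. This is an illustrative mock, not Nvidia or Groq code: the class names, the stand-in KV cache, and the placeholder tokens are all assumptions made to show why the two phases suit different hardware.

```python
# Illustrative sketch (not vendor code): disaggregated serving routes
# compute-bound prefill and latency-sensitive decode to separate pools.
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    generated: list[str] = field(default_factory=list)

class DisaggregatedServer:
    """Hypothetical router: one pool handles prefill, another handles decode."""

    def prefill(self, req: Request) -> list[float]:
        # Prompt processing runs in parallel over all input tokens at once
        # (throughput-bound, GPU-friendly). Returns a stand-in KV cache.
        return [float(len(tok)) for tok in req.prompt.split()]

    def decode(self, req: Request, kv_cache: list[float], steps: int) -> Request:
        # Decode emits one token at a time; per-step latency dominates
        # (the phase the article says LPX takes over).
        for i in range(steps):
            req.generated.append(f"tok{i}")
        return req

    def serve(self, prompt: str, max_tokens: int = 4) -> Request:
        req = Request(prompt)
        kv = self.prefill(req)                    # phase 1: prompt processing
        return self.decode(req, kv, max_tokens)   # phase 2: response generation

out = DisaggregatedServer().serve("why split prefill and decode")
print(out.generated)
```

The design point is that the two phases can be scheduled on different hardware pools independently, so a slow, long prompt never blocks another user's token-by-token response stream.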

Each LPX rack features 256 LPUs with 128 GB of on-chip static random-access memory (SRAM), 150 terabytes per second (TB/s) of bandwidth, chip-to-chip links, and high-speed connections to NVL72, Nvidia’s liquid-cooled AI supercomputer. Combined, these can reduce latency to “near zero,” Nvidia claims.
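A rough back-of-envelope shows why that bandwidth figure matters for decode. The model size below is a hypothetical assumption (the article gives none); only the 150 TB/s figure comes from the text. In memory-bound decode, per-token latency is approximately bytes moved divided by bandwidth.

```python
# Back-of-envelope: decode is memory-bound, so per-token latency is
# roughly active_weight_bytes / memory_bandwidth.
rack_bandwidth_tbps = 150   # TB/s per LPX rack (figure from the article)
model_size_tb = 0.5         # hypothetical 500 GB of active weights (assumed)

per_token_s = model_size_tb / rack_bandwidth_tbps
per_token_ms = per_token_s * 1e3
tokens_per_s = 1 / per_token_s
print(f"{per_token_ms:.2f} ms per token, {tokens_per_s:.0f} tokens/s per stream")
```

Under these assumed numbers a single stream sees a few milliseconds per token, which is the scale of improvement a "near zero" latency claim would have to be measured against.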

The LPX integration with Vera Rubin AI factories will be available in the second half of this year.

Training versus inferencing

Training and inference stress infrastructure in very different ways, noted Sanchit Vir Gogia, chief analyst at Greyhound Research. While training rewards “massive parallelism and brute-force scale,” inferencing (particularly for long context and interactive reasoning) is far more sensitive to latency, memory movement, cache behavior, concurrency, and cost per delivered token.
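The cost-per-delivered-token sensitivity can be made concrete with a small sketch. All figures here are assumed for illustration (the rack price, stream counts, and per-stream throughput are not from the article); the point is only how the arithmetic shifts when long-context workloads cut effective concurrency.

```python
# Illustrative sketch (all figures assumed): steady-state cost per token
# for one inference rack.
def cost_per_token(rack_cost_per_hour: float,
                   concurrent_streams: int,
                   tokens_per_s_per_stream: float) -> float:
    """Dollars per generated token at full utilization."""
    tokens_per_hour = concurrent_streams * tokens_per_s_per_stream * 3600
    return rack_cost_per_hour / tokens_per_hour

# Hypothetical baseline: $200/hr rack, 1,000 streams, 50 tok/s each
base = cost_per_token(200.0, 1000, 50.0)
# Long-context workloads inflate KV-cache memory use, cutting concurrency:
long_ctx = cost_per_token(200.0, 250, 50.0)
print(f"${base * 1e6:.2f} vs ${long_ctx * 1e6:.2f} per million tokens")
```

Quartering concurrency quadruples cost per token at the same throughput per stream, which is why memory movement and cache behavior, not raw FLOPS, dominate inference economics.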
