
Why the AI era is forcing a redesign of the entire compute backbone

Last updated: August 3, 2025 7:48 pm
Published August 3, 2025


The past few decades have seen almost unimaginable advances in compute performance and efficiency, enabled by Moore's Law and underpinned by scale-out commodity hardware and loosely coupled software. This architecture has delivered online services to billions globally and put most of human knowledge at our fingertips.

But the next computing revolution will demand far more. Fulfilling the promise of AI requires a step-change in capabilities far exceeding the advances of the internet era. To achieve this, we as an industry must revisit some of the foundations that drove the previous transformation and innovate collectively to rethink the entire technology stack. Let's explore the forces driving this upheaval and lay out what this architecture must look like.

From commodity hardware to specialized compute

For decades, the dominant trend in computing has been the democratization of compute through scale-out architectures built on nearly identical commodity servers. This uniformity allowed for flexible workload placement and efficient resource utilization. The demands of gen AI, heavily reliant on predictable mathematical operations over massive datasets, are reversing this trend.

We are now witnessing a decisive shift toward specialized hardware — including ASICs, GPUs and tensor processing units (TPUs) — that delivers orders-of-magnitude improvements in performance per dollar and per watt compared to general-purpose CPUs. This proliferation of domain-specific compute units, optimized for narrower tasks, will be essential to sustaining the rapid pace of progress in AI.
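To make the "performance per watt" framing concrete, here is a back-of-the-envelope sketch. The throughput and power figures are illustrative assumptions, not measured specs for any real CPU or accelerator:

```python
# Illustrative (assumed) figures: a general-purpose CPU at a few TFLOP/s
# versus an AI accelerator at hundreds of lower-precision TFLOP/s.
def perf_per_watt(flops_per_sec: float, watts: float) -> float:
    return flops_per_sec / watts

cpu = perf_per_watt(2e12, 300)    # hypothetical CPU: 2 TFLOP/s at 300 W
acc = perf_per_watt(400e12, 400)  # hypothetical accelerator: 400 TFLOP/s at 400 W

print(f"accelerator advantage: ~{acc / cpu:.0f}x performance per watt")
```

Even with the exact numbers debatable, the two-orders-of-magnitude gap on narrow workloads is what drives the economics described above.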



Beyond Ethernet: The rise of specialized interconnects

These specialized systems will often require "all-to-all" communication, with terabit-per-second bandwidth and nanosecond latencies that approach local memory speeds. Today's networks, largely based on commodity Ethernet switches and TCP/IP protocols, are ill-equipped to handle these extreme demands.

Consequently, to scale gen AI workloads across huge clusters of specialized accelerators, we are seeing the rise of specialized interconnects, such as ICI for TPUs and NVLink for GPUs. These purpose-built networks prioritize direct memory-to-memory transfers and use dedicated hardware to speed information sharing among processors, effectively bypassing the overhead of traditional, layered networking stacks.
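A rough model shows why link bandwidth dominates at this scale. The sketch below uses the standard cost of a bandwidth-optimal ring all-reduce; the 100 Gb/s and 1.2 Tb/s link speeds are illustrative assumptions, and latency and protocol overhead are ignored:

```python
def ring_allreduce_seconds(tensor_bytes: float, n_devices: int,
                           link_bytes_per_sec: float) -> float:
    # A bandwidth-optimal ring all-reduce moves 2*(N-1)/N of the tensor
    # over each device's link (reduce-scatter plus all-gather phases).
    return 2 * (n_devices - 1) / n_devices * tensor_bytes / link_bytes_per_sec

GIB = 1024**3
# Assumed links: 100 Gb/s commodity Ethernet vs. a ~1.2 Tb/s accelerator link.
slow = ring_allreduce_seconds(4 * GIB, 64, 100e9 / 8)
fast = ring_allreduce_seconds(4 * GIB, 64, 1.2e12 / 8)
print(f"4 GiB all-reduce across 64 devices: {slow*1e3:.0f} ms vs {fast*1e3:.0f} ms")
```

Because this exchange happens every training step, the bandwidth ratio between the two links translates almost directly into step-time savings.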


This move toward tightly integrated, compute-centric networking will be essential to overcoming communication bottlenecks and scaling the next generation of AI efficiently.

Breaking the memory wall

For decades, performance gains in computation have outpaced growth in memory bandwidth. While techniques like caching and stacked SRAM have partially mitigated this, the data-intensive nature of AI is only exacerbating the problem.

The insatiable need to feed increasingly powerful compute units has led to high-bandwidth memory (HBM), which stacks DRAM directly on the processor package to boost bandwidth and reduce latency. However, even HBM faces fundamental limitations: the physical chip perimeter restricts total dataflow, and moving massive datasets at terabit speeds creates significant energy constraints.
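The memory wall is often quantified with the roofline model: achievable throughput is the minimum of the compute ceiling and memory bandwidth times arithmetic intensity (FLOPs performed per byte moved). A minimal sketch, with assumed peak-compute and HBM figures:

```python
def attainable_flops(peak_flops: float, mem_bw: float, intensity: float) -> float:
    # Roofline model: throughput is capped either by the compute ceiling
    # or by memory bandwidth times arithmetic intensity (FLOPs per byte).
    return min(peak_flops, mem_bw * intensity)

PEAK = 400e12   # assumed accelerator peak, FLOP/s
HBM_BW = 3e12   # assumed HBM bandwidth, bytes/s

low = attainable_flops(PEAK, HBM_BW, 0.25)   # bandwidth-bound elementwise op
high = attainable_flops(PEAK, HBM_BW, 500)   # large matmul with high data reuse
print(f"utilization: {low/PEAK:.2%} vs {high/PEAK:.0%} of peak")
```

The low-intensity case leaves the compute units almost entirely idle, which is exactly the "powerful compute resources waiting for data" failure mode described below.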

These limitations highlight the critical need for higher-bandwidth connectivity and underscore the urgency of breakthroughs in processing and memory architecture. Without these innovations, our powerful compute resources will sit idle waiting for data, dramatically limiting efficiency and scale.

From server farms to high-density systems

Today's advanced machine learning (ML) models often rely on carefully orchestrated calculations across tens to hundreds of thousands of identical compute elements, consuming immense power. This tight coupling and fine-grained synchronization at the microsecond level impose new demands. Unlike systems that embrace heterogeneity, ML computations require homogeneous elements; mixing generations would bottleneck faster units. Communication pathways must also be pre-planned and highly efficient, since a delay in a single element can stall an entire process.
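The cost of that fine-grained synchronization can be sketched directly: a synchronous step is gated by the slowest worker, so even small per-worker jitter compounds at scale. The step times below are simulated with assumed values, not measured:

```python
import random

def sync_step_ms(worker_times_ms):
    # A fully synchronized training step finishes only when the
    # slowest participating worker does.
    return max(worker_times_ms)

rng = random.Random(0)
# Assumed per-worker times: 10 ms nominal plus a small amount of jitter.
times = [10 + abs(rng.gauss(0, 0.5)) for _ in range(10_000)]
median = sorted(times)[len(times) // 2]
print(f"median worker: {median:.1f} ms, actual step: {sync_step_ms(times):.1f} ms")
```

With ten thousand workers the step time tracks the tail of the jitter distribution, not the median, which is why latency variance (and hence physical distance) matters so much.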

These extreme demands for coordination and power are driving the need for unprecedented compute density. Minimizing the physical distance between processors becomes essential to reducing latency and power consumption, paving the way for a new class of ultra-dense AI systems.

This drive for extreme density and tightly coordinated computation fundamentally alters the optimal infrastructure design, demanding a radical rethinking of physical layouts and dynamic power management to prevent performance bottlenecks and maximize efficiency.

A new approach to fault tolerance

Traditional fault tolerance relies on redundancy among loosely connected systems to achieve high uptime. ML computing demands a different approach.

First, the sheer scale of computation makes over-provisioning too costly. Second, model training is a tightly synchronized process, where a single failure can cascade to thousands of processors. Finally, advanced ML hardware often pushes the boundary of current technology, potentially leading to higher failure rates.


Instead, the emerging strategy involves frequent checkpointing — saving computation state — coupled with real-time monitoring, rapid allocation of spare resources and quick restarts. The underlying hardware and network design must enable swift failure detection and seamless component replacement to maintain performance.
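A minimal simulation of this checkpoint-and-restart strategy (with a hypothetical per-step failure probability) shows how progress survives failures without redundant hardware:

```python
import pickle
import random

def train_with_checkpoints(total_steps, ckpt_every, fail_prob, rng):
    # Simulates synchronous training that survives failures by rolling
    # back to the last saved state instead of over-provisioning hardware.
    state = {"step": 0}
    checkpoint = pickle.dumps(state)   # saved computation state
    failures = 0
    while state["step"] < total_steps:
        if rng.random() < fail_prob:              # a component fails mid-step
            state = pickle.loads(checkpoint)      # rapid restart from checkpoint
            failures += 1
            continue
        state["step"] += 1
        if state["step"] % ckpt_every == 0:
            checkpoint = pickle.dumps(state)      # refresh the checkpoint
    return state["step"], failures

steps, failures = train_with_checkpoints(1000, 50, 0.01, random.Random(42))
print(f"completed {steps} steps despite {failures} simulated failures")
```

The checkpoint interval is the key tuning knob: shorter intervals waste I/O bandwidth, longer intervals waste recomputation after each rollback.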

A more sustainable approach to power

Today and looking ahead, access to power is a key bottleneck for scaling AI compute. While traditional system design focuses on maximum performance per chip, we must shift to an end-to-end design focused on delivered, at-scale performance per watt. This approach is essential because it considers all system components — compute, network, memory, power delivery, cooling and fault tolerance — working together seamlessly to sustain performance. Optimizing components in isolation severely limits overall system efficiency.

As we push for greater performance, individual chips require more power, often exceeding the cooling capacity of traditional air-cooled data centers. This necessitates a shift toward more energy-intensive, but ultimately more efficient, liquid cooling solutions, and a fundamental redesign of data center cooling infrastructure.

Beyond cooling, conventional redundant power sources, like dual utility feeds and diesel generators, carry substantial financial cost and slow capacity delivery. Instead, we must combine diverse power generation and storage at multi-gigawatt scale, managed by real-time microgrid controllers. By leveraging the flexibility and geographic distribution of AI workloads, we can deliver more capability without expensive backup systems that are needed only a few hours per year.

This evolving power model enables real-time response to power availability — from shutting down computations during shortages to advanced techniques like frequency scaling for workloads that can tolerate reduced performance. All of this requires real-time telemetry and actuation at levels not currently available.
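Frequency scaling works as a power lever because dynamic power grows super-linearly with clock speed. A simplified sketch, assuming the common cubic model (supply voltage tracking frequency) and ignoring static power and discrete DVFS steps:

```python
def scaled_frequency_ghz(nominal_ghz: float, nominal_power_w: float,
                         available_w: float) -> float:
    # Under the cubic approximation, dynamic power scales with f^3 when
    # voltage tracks frequency, so a power cap maps to a cube-root
    # frequency reduction. Real chips add static power and step quantization.
    if available_w >= nominal_power_w:
        return nominal_ghz
    return nominal_ghz * (available_w / nominal_power_w) ** (1 / 3)

# Illustrative scenario: a grid event halves the available power budget.
print(f"{scaled_frequency_ghz(1.8, 700, 350):.2f} GHz")
```

Note the asymmetry that makes this attractive: halving power costs only about 21% of frequency under this model, which is why throughput-tolerant workloads are good candidates for riding through shortages.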

Security and privacy: Baked in, not bolted on

A critical lesson from the internet era is that security and privacy cannot be effectively bolted onto an existing architecture. Threats from bad actors will only grow more sophisticated, so protections for user data and proprietary intellectual property must be built into the fabric of the ML infrastructure. One crucial observation is that AI will, eventually, enhance attacker capabilities. This, in turn, means we must ensure that AI simultaneously supercharges our defenses.

This includes end-to-end data encryption, robust data lineage tracking with verifiable access logs, hardware-enforced security boundaries to protect sensitive computations, and sophisticated key management systems. Integrating these safeguards from the ground up will be essential for protecting users and maintaining their trust. Real-time monitoring of what will likely be petabits per second of telemetry and logging will be key to identifying and neutralizing needle-in-the-haystack attack vectors, including those originating from insider threats.


Speed as a strategic imperative

The rhythm of hardware upgrades has shifted dramatically. Unlike the incremental rack-by-rack evolution of traditional infrastructure, deploying ML supercomputers requires a fundamentally different approach. ML compute does not simply run on heterogeneous deployments; the compute code, algorithms and compiler must be specifically tuned to each new hardware generation to fully exploit its capabilities. The pace of innovation is also unprecedented, with new hardware often delivering a factor of two or more in performance year over year.

Therefore, instead of incremental upgrades, a massive and simultaneous rollout of homogeneous hardware, often across entire data centers, is now required. With annual hardware refreshes delivering integer-factor performance improvements, the ability to rapidly stand up these colossal AI engines is paramount.

The goal must be to compress timelines from design to fully operational deployments of 100,000-plus chips, enabling efficiency improvements while supporting algorithmic breakthroughs. This necessitates radical acceleration and automation of every stage, demanding a manufacturing-like model for these infrastructures. From architecture to monitoring and repair, every step must be streamlined and automated to leverage each hardware generation at unprecedented scale.

Meeting the moment: A collective effort for next-gen AI infrastructure

The rise of gen AI marks not just an evolution, but a revolution that requires a radical reimagining of our computing infrastructure. The challenges ahead — in specialized hardware, interconnect networks and sustainable operations — are significant, but so too is the transformative potential of the AI they will enable.

It is easy to see that our compute infrastructure will be unrecognizable a few years from now, which means we cannot simply improve on the blueprints we have already designed. Instead, we must collectively, from research to industry, embark on an effort to re-examine the requirements of AI compute from first principles, building a new blueprint for the underlying global infrastructure. This in turn will lead to fundamentally new capabilities, from medicine to education to business, at unprecedented scale and efficiency.

Amin Vahdat is VP and GM for machine learning, systems and cloud AI at Google Cloud.

