Google has unveiled Ironwood, its seventh-generation AI chip, which the company said is designed to handle the most demanding AI inference workloads at scale.
At Google Cloud Next 25 yesterday (April 9), Google said the new Ironwood tensor processing unit (TPU) represents a “significant shift in the development of AI” and the infrastructure that powers its progress.
“Ironwood is our most powerful, capable and energy-efficient TPU yet. And it’s purpose-built to power thinking, inferential AI models at scale,” said Amin Vahdat, vice president and general manager of machine learning at Google’s Systems and Cloud AI division, in an accompanying blog post.
“For more than a decade, TPUs have powered Google’s most demanding AI training and serving workloads and have enabled our cloud customers to do the same.”
The Age of Inference
According to Google, Ironwood represents a shift from responsive AI models, which provide real-time information for people to interpret, to models that proactively generate insights and interpretation.
“Ironwood is built to support this next phase of generative AI and its tremendous computational and communication requirements,” the search giant said.
One of several new components in the Google Cloud AI Hypercomputer architecture, Ironwood scales up to 9,216 liquid-cooled chips linked with Inter-Chip Interconnect (ICI) networking spanning nearly 10 MW.
Each chip delivers a peak performance of 4,614 teraflops. When scaled to 9,216 chips per pod for 42.5 exaflops, Ironwood is said to deliver more than 24 times the compute power of the world’s largest supercomputer, El Capitan.
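The pod-level figure follows directly from the per-chip number, as a quick sanity check shows. (The El Capitan peak used below is an outside benchmark figure, not from Google's announcement, and the two systems quote performance at different numerical precisions, so the ratio is only indicative.)

```python
# Sanity-checking the stated Ironwood pod figures.
PEAK_TFLOPS_PER_CHIP = 4_614   # per-chip peak, teraflops (stated)
CHIPS_PER_POD = 9_216          # chips in a full pod (stated)

# teraflops -> exaflops: divide by 1,000,000
pod_exaflops = PEAK_TFLOPS_PER_CHIP * CHIPS_PER_POD / 1_000_000
print(f"Pod peak: {pod_exaflops:.1f} exaflops")  # matches the quoted 42.5

# El Capitan's Top500 FP64 result, ~1.742 exaflops (assumption, not from
# the announcement; Ironwood's peak is quoted at lower precision).
EL_CAPITAN_EXAFLOPS = 1.742
print(f"Ratio vs El Capitan: {pod_exaflops / EL_CAPITAN_EXAFLOPS:.1f}x")
```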
Google Ironwood: Key Features
Key features of Google Ironwood include:
- Significant performance gains, with a focus on efficiency. Ironwood’s performance per watt is 2x that of Trillium, the sixth-generation TPU announced last year.
- Increased High Bandwidth Memory (HBM) capacity. Ironwood offers 192 GB per chip, 6x that of Trillium.
- Improved HBM bandwidth, reaching 7.2 TBps per chip. This high bandwidth ensures rapid data access for memory-intensive AI workloads.
“Ironwood represents a unique breakthrough in the age of inference with increased computation power, memory capacity, ICI networking advancements and reliability,” Vahdat said.
“These breakthroughs, coupled with a nearly 2x improvement in power efficiency, mean that our most demanding customers can take on training and serving workloads with the highest performance and lowest latency, all while meeting the exponential rise in computing demand.”
The AI Chip Race Heats Up
Google’s Ironwood announcement is the latest in a string of next-gen chip launches aimed at powering large-scale AI workloads.
Last month, at GTC 2025, Nvidia CEO Jensen Huang outlined the chip giant’s AI vision, unveiling new supercomputers and software to power next-gen workloads, including the new Blackwell Ultra AI chip and Vera Rubin processors.
In February, Intel expanded its family of Xeon 6 processors with new high-performance chips designed for enterprises with compute-intensive needs, such as AI, virtualization, and databases.
Microsoft, meanwhile, recently announced Majorana 1, its first quantum computing chip, which is said to mark a major step in the company’s effort to produce devices that may one day solve problems beyond the reach of modern computers.
