Nvidia on Monday (March 18) introduced its next-generation Blackwell architecture GPUs, new supercomputers, and new software that the company says will make it faster and easier for enterprises to build and run generative AI and other energy-intensive applications.
The new family of Blackwell GPUs offers 20 petaflops of AI performance on a single GPU and will allow organizations to train AI models four times faster, improve AI inferencing performance by 30 times, and do it with up to 25 times better energy efficiency than Nvidia’s previous-generation Hopper architecture chips, said Ian Buck, Nvidia’s vice president of hyperscale and HPC, in a media briefing.
GTC Product Showcase
Nvidia CEO Jensen Huang unveiled Blackwell and other innovations during a keynote speech on Monday to kick off Nvidia’s 2024 GTC developer conference in San Jose, California.
“Generative AI is the defining technology of our time. Blackwell GPUs are the engine to power this new industrial revolution,” Huang said in a statement.
New Blackwell GPUs include the Nvidia GB200 Grace Blackwell Superchip, which connects two new Nvidia Blackwell-based B200 GPUs and the current Nvidia Grace CPU to deliver 40 petaflops of AI performance.
During his speech, Huang also introduced new hardware: the Nvidia GB200 NVL72, a liquid-cooled, rack-scale server system that features 36 Grace Blackwell Superchips and serves as the foundation of forthcoming Nvidia DGX SuperPOD AI supercomputers that can enable large, trillion-parameter-scale AI models.
Amazon Web Services, Google Cloud, and Oracle Cloud Infrastructure will make GB200 NVL72 instances available on the Nvidia DGX Cloud AI supercomputer cloud service later this year, the company said.
Nvidia also launched a new version of its AI software, AI Enterprise 5.0, which features new Nvidia NIM microservices – a set of pre-built containers, standard APIs, domain-specific code, and optimized inference engines that make it much faster and easier for enterprises to develop AI-powered business applications and run AI models in the cloud, in data centers, and even on GPU-accelerated workstations.
NIM can cut deployment times from weeks to minutes, the company said.
“It’s the runtime you need to run your model in the most optimized, enterprise-grade, secure, compatible way, so you can easily build your enterprise application,” said Manuvir Das, Nvidia’s vice president of enterprise computing, during a media briefing before the keynote.
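Because NIM packages a model behind standard APIs, calling a deployed microservice reduces to an ordinary HTTP request. The sketch below is a hypothetical illustration: the endpoint URL, port, response schema, and model name are placeholder assumptions in the style of an OpenAI-compatible chat API, not confirmed product details.

```python
import requests

# Hypothetical endpoint: a locally deployed NIM container serving a chat
# model behind an OpenAI-style completions API. Host, port, model name,
# and response schema are placeholders, not confirmed product details.
NIM_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "example/chat-model",  # placeholder model identifier
    "messages": [
        {"role": "user", "content": "Summarize Nvidia's Blackwell announcement."}
    ],
    "max_tokens": 128,
}

response = requests.post(NIM_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```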
Nvidia Seeks to Extend Its AI Market Share Lead
Analysts say Nvidia’s latest announcements are significant, as the company aims to further bolster its leadership position in the lucrative AI chip market.
GPUs and faster hardware are in high demand as enterprises, cloud providers, and other data center operators race to add more data center capacity to power AI and high-performance computing (HPC) workloads, such as computer-aided drug design and electronic design automation.
“These announcements shore up Nvidia’s leadership position, and it shows the company continues to innovate,” Jim McGregor, founder and principal analyst at Tirias Research, told DCN.
Matt Kimball, vice president and principal analyst at Moor Insights & Strategy, agrees, saying Nvidia continues to improve upon its AI ecosystem that spans everything from chips and hardware to frameworks and software. “The bigger story is that Nvidia drives the AI journey, from a software, tooling, and hardware perspective,” he said.
Intense Competition
Nvidia dominates the AI market, with its GPUs capturing about 85% of the AI chip market, while other chipmakers – such as AMD, Intel, Google, AWS, Microsoft, Cerebras, and others – have captured 15%, according to Karl Freund, founder and principal analyst of Cambrian-AI Research.
While startup Cerebras competes with Nvidia at the very high end of the market with its latest wafer-sized chip, AMD has become a significant competitor in the GPU space with the December launch of its AMD Instinct MI300 GPUs, McGregor said.
Intel, for its part, currently has two AI chip offerings with its Intel Data Center GPU Max Series and Gaudi accelerators. Intel plans to consolidate its AI offerings into a single AI chip in 2025, McGregor said.
In its December launch, AMD claimed that its MI300 GPUs were faster than Nvidia’s H100 and forthcoming H200 GPU, which is expected during the second quarter of this year. But Blackwell will allow Nvidia to leapfrog AMD in performance, McGregor said. How much faster won’t be determined until benchmarks are released, Freund added.
But the Blackwell GPUs are set to provide a huge step forward in performance, Kimball said.
“This isn’t an incremental improvement, it’s leaps forward,” he said. “A 400% better training performance over Nvidia’s current generation of chips is absolutely incredible.”
Kimball expects AMD will answer back, however. “It’s a GPU arms race,” he said. “AMD has a very competitive offering with the MI300, and I would fully expect AMD will come out with another competitive offering here before too long.”
More Blackwell GPU Details
The Blackwell architecture GPUs will enable AI training and real-time inferencing of models that scale up to 10 trillion parameters, Nvidia executives said.
Today, only a few models reach trillions of parameters, but that will begin to change as models get larger and as the industry starts moving toward video models, McGregor of Tirias Research noted.
“The point is, with generative AI, you need lots of processing, lots of memory bandwidth, and you need to put it in a very efficient system,” McGregor said. “For Nvidia, it’s about continuing to scale up to meet the needs of these large language models.”
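Some back-of-the-envelope arithmetic illustrates why models at this scale demand so much memory and so many GPUs. The figures below are assumptions for the exercise: the parameter count comes from Nvidia’s 10-trillion claim above, while the 192 GB-per-GPU capacity is a hypothetical round number, not a published Blackwell specification.

```python
# Rough memory math for holding a large model's weights at different
# numeric precisions. All figures are illustrative assumptions.
PARAMS = 10 * 10**12   # a 10-trillion-parameter model
GPU_MEMORY_GB = 192    # assumed memory per GPU (hypothetical)

for name, bytes_per_param in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    total_gb = PARAMS * bytes_per_param / 10**9
    gpus_needed = total_gb / GPU_MEMORY_GB
    print(f"{name}: ~{total_gb:,.0f} GB of weights -> "
          f"~{gpus_needed:,.0f} GPUs just to hold them")
```

Even at 4-bit precision, the weights alone would occupy dozens of GPUs before accounting for activations or KV caches, which is why rack-scale systems like the GB200 NVL72 treat many GPUs as one large accelerator.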
According to an Nvidia spokesperson, the Blackwell architecture GPUs will come in three configurations (a quick per-GPU arithmetic check follows the list):
- The HGX B200, which will deliver up to 144 petaflops of AI performance in an eight-GPU configuration. It supports x86-based generative AI platforms and networking speeds of up to 400 Gb/s via Nvidia’s newly announced Quantum-X800 InfiniBand and Spectrum-X800 Ethernet networking switches.
- The HGX B100, which will deliver up to 112 petaflops of AI performance on an eight-GPU baseboard and can be used to upgrade existing data center infrastructure running Nvidia Hopper systems.
- The GB200 Grace Blackwell Superchip, which delivers 40 petaflops of AI performance and 900 GB/s of bidirectional bandwidth, and can scale up to 576 GPUs.
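As a quick sanity check on those figures, the implied per-GPU throughput can be derived by dividing each configuration’s published total by its GPU count (the GB200 Superchip pairs two Blackwell GPUs with one Grace CPU). This is arithmetic on the numbers above, not an official specification:

```python
# Per-GPU AI performance implied by each configuration's published totals.
configs = {
    "HGX B200": (144, 8),        # total petaflops, GPU count
    "HGX B100": (112, 8),
    "GB200 Superchip": (40, 2),  # two Blackwell GPUs per Superchip
}

for name, (petaflops, gpus) in configs.items():
    print(f"{name}: {petaflops / gpus:.0f} petaflops per GPU")
```

The spread – 18, 14, and 20 petaflops per GPU – lines up with the 20-petaflop single-GPU headline figure quoted earlier, with the lower HGX numbers presumably reflecting tighter power and cooling envelopes.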
Nvidia said the Blackwell architecture features six revolutionary technologies that result in improved performance and better energy efficiency. Those include fifth-generation NVLink, which delivers 1.8 TB/s of bidirectional throughput per GPU, and a second-generation transformer engine that can accelerate training and inferencing, the company said.
NVLink connects GPUs to GPUs so the programmer sees one integrated GPU, while the second-generation transformer engine allows for more efficient processing via a 4-bit floating point format, Freund said.
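To make the 4-bit idea concrete, the sketch below decodes every bit pattern of a 4-bit float in the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) commonly used for FP4. Treating Blackwell’s format as E2M1 is an assumption here, since the announcement did not spell out the encoding.

```python
# Decode all 16 bit patterns of a 4-bit float in the common E2M1 layout
# (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1).
# Assumed layout for illustration; Nvidia has not detailed its exact format.

def decode_e2m1(bits: int) -> float:
    sign = -1.0 if bits & 0b1000 else 1.0
    exp = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exp == 0:                                  # subnormal: no implicit leading 1
        magnitude = mantissa * 0.5
    else:                                         # normal: implicit 1, bias of 1
        magnitude = (1 + mantissa * 0.5) * 2 ** (exp - 1)
    return sign * magnitude

print(sorted({decode_e2m1(b) for b in range(16)}))
# -> 15 distinct values (+0 and -0 collapse), ranging from -6.0 to 6.0
```

With only 15 distinct representable values, FP4 halves weight memory and bandwidth again relative to FP8, which is why low-precision engines typically pair it with dynamic scaling to preserve accuracy.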
“It’s all part of the optimization story. It’s all about getting work done faster,” he said.
Blackwell Availability
Pricing of Blackwell GPUs has not been announced. Nvidia expects partners to begin releasing Blackwell-based products later this year.
Nvidia on Monday also announced two new DGX SuperPOD AI supercomputers – the DGX GB200 and DGX B200 systems – which will be available from partners later this year, the company said.
Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo, and Supermicro are among the hardware makers that plan to ship servers based on Blackwell GPUs, Nvidia said.