Rubin GPUs can deliver 50 petaflops of inference compute using the NVFP4 data format, 5 times faster than Blackwell, and 35 petaflops of NVFP4 training compute, 3.5 times faster than Blackwell. HBM4 memory provides 22 Tbps of bandwidth, a 2.8x increase over Blackwell, and NVLink bandwidth per GPU is 3.6 Tbps, double Blackwell's speed.
Networking is enhanced with the liquid-cooled NVLink 6 Switch, offering 400G SerDes, 3.6 Tbps of per-GPU bandwidth, 28.8 Tbps of total switching bandwidth, and 14.4 teraflops of FP8 in-network compute capability.
Altogether, the Vera Rubin NVL72 platform delivers up to 3.6 exaflops of NVFP4 inference, 5 times faster than the previous-generation platform, and up to 2.5 exaflops of NVFP4 training, 3.5 times more than the previous generation.
Vera Rubin NVL72 includes 54 TB of LPDDR5x capacity (2.5x Blackwell), 20.7 TB of HBM4 (50% more), 1.6 Pbps of HBM4 bandwidth (a 2.8x increase), and 260 Tbps of scale-up bandwidth (double that of Blackwell NVL72). "That's more bandwidth than the entire global Internet," said Harris.
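For a sense of how the rack-level figures follow from the per-GPU numbers quoted earlier, here is a minimal back-of-the-envelope check, assuming the 72 GPUs per rack that the NVL72 name implies:

```python
# Back-of-the-envelope check: rack-level Vera Rubin NVL72 figures
# derived from the per-GPU numbers above (assumes 72 GPUs per rack).

GPUS_PER_RACK = 72

per_gpu = {
    "nvfp4_inference_pflops": 50,   # petaflops per GPU
    "nvfp4_training_pflops": 35,    # petaflops per GPU
    "hbm4_bandwidth_tbps": 22,      # Tbps per GPU
    "nvlink_bandwidth_tbps": 3.6,   # Tbps per GPU
}

rack = {name: value * GPUS_PER_RACK for name, value in per_gpu.items()}

print(f"Inference: {rack['nvfp4_inference_pflops'] / 1000:.1f} exaflops")  # ~3.6 EF
print(f"Training:  {rack['nvfp4_training_pflops'] / 1000:.2f} exaflops")   # ~2.52 EF (quoted as 2.5)
print(f"HBM4 BW:   {rack['hbm4_bandwidth_tbps'] / 1000:.2f} Pbps")         # ~1.58 Pbps (quoted as 1.6)
print(f"Scale-up:  {rack['nvlink_bandwidth_tbps']:.0f} Tbps")              # ~259 Tbps (quoted as 260)
```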
Nvidia also redesigned the rack, announcing its third-generation NVL72 rack resiliency. Features include a cable-free modular tray design that enables assembly and servicing 18 times faster than the previous generation.
The NVLink Intelligent Resiliency feature supports server maintenance with "zero downtime," keeping racks operational even during component swaps or partial population. The second-generation RAS Engine allows GPU diagnostics without taking the rack offline.
