Data Center News
Global Market

Nvidia rolls out new GPUs for AI inferencing, large workloads

Last updated: September 14, 2025 10:29 pm
Published September 14, 2025
Rubin has two dies with 25 petaFLOPS per die, NVLink interconnect, and 288GB of HBM4 high-speed memory. The Rubin CPX has a single die with 30 petaFLOPS of performance, no NVLink, and 128GB of GDDR7 memory. Rubin CPX is therefore aimed at specific high-context workloads that don't need large amounts of memory. CPX will be cheaper than the standard Rubin, though Nvidia wouldn't say by how much.

To process video, AI models can consume up to a million tokens for an hour of content, which can take many hours, if not days, to generate. The more tokens the system can generate, the larger the scale of processing it can handle.
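As a back-of-envelope sketch of that claim, the arithmetic below converts the article's figure of roughly one million tokens per hour of video into a generation time. The 50 tokens/s throughput is a made-up illustration, not an Nvidia number.

```python
TOKENS_PER_HOUR_OF_VIDEO = 1_000_000  # figure cited in the article

def generation_hours(video_hours: float, tokens_per_second: float) -> float:
    """Hours needed to generate the tokens for `video_hours` of content."""
    total_tokens = video_hours * TOKENS_PER_HOUR_OF_VIDEO
    return total_tokens / tokens_per_second / 3600

# At a hypothetical 50 tokens/s, one hour of video takes about 5.6 hours
# to generate, consistent with "many hours if not days".
print(f"{generation_hours(1, 50):.1f} hours")
```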

Rubin CPX delivers up to 30 petaFLOPS of compute with NVFP4 precision. It features 128GB of GDDR7 memory rather than the usual HBM memory, which is more expensive than GDDR7. Nvidia says GDDR7 offers sufficient performance, and that Rubin CPX delivers three times faster attention capabilities compared with GB300 NVL72 systems.

Rubin CPX is available in multiple configurations, including the Vera Rubin NVL144 CPX, and can be combined with the Quantum-X800 InfiniBand scale-out compute fabric or the Spectrum-X Ethernet networking platform with Nvidia Spectrum-XGS Ethernet technology and Nvidia ConnectX-9 SuperNICs.

Nvidia is also announcing a new Vera Rubin NVL144 CPX rack. Narasimhan said the NVL144 CPX lets AI service providers dramatically improve profitability, delivering $5 billion of revenue for every $100 million invested in infrastructure.
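Put another way, that profitability claim is a 50x revenue multiple, as the quick arithmetic below shows; the multiple is our own calculation from the two figures quoted above, not an Nvidia metric.

```python
# Nvidia's quoted figures: $5B revenue per $100M of infrastructure spend.
revenue = 5_000_000_000
investment = 100_000_000

multiple = revenue / investment
print(f"{multiple:.0f}x revenue per dollar invested")  # 50x
```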

It comes in two configurations. The single rack pairs 144 Rubin CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs, delivering 8 exaFLOPS of NVFP4 compute, 100TB of fast memory, and 1.7 PB/s of memory bandwidth. Nvidia said it is 7.5 times faster than the current top-of-the-line GB300 NVL72.
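One way to read those headline specs together is as a bandwidth-to-compute ratio. The derived bytes-per-FLOP figure below is our own arithmetic from the quoted rack totals, not a published Nvidia specification.

```python
# Rack-level figures quoted in the article.
bandwidth_bytes_per_s = 1.7e15  # 1.7 PB/s of memory bandwidth
compute_flops = 8e18            # 8 exaFLOPS of NVFP4 compute

# Bytes of memory bandwidth available per FLOP of compute.
ratio = bandwidth_bytes_per_s / compute_flops
print(f"{ratio:.2e} bytes/FLOP")
```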
