Oracle has unveiled Oracle Cloud Infrastructure (OCI) Zettascale10, a new generation of AI supercomputing infrastructure that it calls the largest AI supercomputer in the cloud. The system connects hundreds of thousands of NVIDIA GPUs across multiple Oracle data centers to create multi-gigawatt superclusters capable of delivering up to 16 zettaFLOPS of peak performance.
OCI Zettascale10 forms the computational foundation of Stargate, the flagship supercluster built in collaboration with OpenAI in Abilene, Texas. The system represents a major leap in cloud-based AI performance, fusing Oracle’s next-generation Acceleron RoCE networking architecture with NVIDIA’s full-stack AI infrastructure. Together, the companies are setting new benchmarks for scale, energy efficiency, and reliability in distributed AI computing.
According to Mahesh Thiagarajan, Executive Vice President at Oracle Cloud Infrastructure, Zettascale10 redefines what is possible for enterprise-scale AI. “With OCI Zettascale10, we’re combining OCI’s groundbreaking Acceleron RoCE network with NVIDIA’s latest AI infrastructure to deliver multi-gigawatt AI capacity at unmatched scale,” he said. “Customers can build, train, and deploy their largest AI models into production with less power per unit of performance and with high reliability. They’ll also benefit from strong data and AI sovereignty controls across Oracle’s distributed cloud.”
The system builds on Oracle’s first Zettascale cluster, launched in 2024, but scales dramatically in size and performance. Each Zettascale10 cluster is housed in a gigawatt-class data center campus engineered for extreme density within a two-kilometer radius – an architectural design that minimizes GPU-to-GPU latency, a critical factor in large-scale AI model training. The Abilene Stargate site serves as the pilot deployment for the new architecture, offering a real-world testbed for next-generation AI infrastructure.
Peter Hoeschele, Vice President of Infrastructure and Industrial Compute at OpenAI, emphasized the scale of the project. “The OCI Zettascale10 network and cluster fabric was developed and deployed first at our joint supercluster in Abilene,” he said. “The highly scalable RoCE design maximizes performance at gigawatt scale while ensuring most of the power is focused on compute. We’re excited to continue scaling Abilene and the broader Stargate program together.”
Delivering AI at Extraordinary Scale
Oracle’s Zettascale10 clusters are designed to handle workloads of extraordinary complexity, targeting deployments of up to 800,000 NVIDIA GPUs with consistent, predictable performance. The combination of Oracle’s low-latency RoCEv2 networking and NVIDIA’s AI infrastructure stack allows enterprises to scale from research environments to industrialized AI production with minimal friction and strong cost efficiency.
Ian Buck, Vice President of Hyperscale at NVIDIA, said the partnership brings together the best of both companies’ technologies. “Oracle and NVIDIA are uniting OCI’s distributed cloud and our full-stack AI infrastructure to deliver AI at extraordinary scale,” Buck said. “OCI Zettascale10 provides the compute fabric needed to advance state-of-the-art AI research and help organizations move from experimentation to production-grade AI.”
Central to Oracle’s new design is the Acceleron RoCE networking system, which leverages the switching capabilities built into modern GPU network interface cards (NICs). These NICs can connect to multiple switches simultaneously, each on an independent network plane. This architecture boosts scalability and reliability by automatically rerouting traffic when network planes experience congestion or failure – eliminating costly job restarts during AI training.
The networking system includes several key innovations designed to support hyperscale AI workloads. Its wide, shallow fabric allows customers to build larger clusters faster at lower cost by using GPU NICs as mini-switches that connect across multiple isolated planes, cutting power use and infrastructure tiers. Enhanced reliability prevents job interruptions by isolating data flows and dynamically shifting traffic away from unstable areas. Oracle’s use of Linear Pluggable Optics (LPO) and Linear Receiver Optics (LRO) also reduces network and cooling energy consumption without compromising high-throughput connectivity at 400G or 800G speeds.
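The multi-plane failover idea can be pictured with a minimal sketch. This is an illustrative toy model only: the plane count, health states, and selection rule below are assumptions for explanation, not Oracle’s actual Acceleron implementation.

```python
import random

class MultiPlaneNIC:
    """Toy model of a NIC attached to several independent network planes.

    Flows are steered onto healthy planes, so congestion or failure on
    one plane reroutes traffic instead of aborting the training job.
    (Hypothetical names and logic, for illustration only.)
    """

    def __init__(self, num_planes=4):
        # Track per-plane health; all planes start healthy.
        self.healthy = {p: True for p in range(num_planes)}

    def mark_unhealthy(self, plane):
        """Record congestion or failure observed on a plane."""
        self.healthy[plane] = False

    def mark_healthy(self, plane):
        """Return a recovered plane to the candidate set."""
        self.healthy[plane] = True

    def pick_plane(self):
        """Choose a healthy plane for the next flow."""
        candidates = [p for p, ok in self.healthy.items() if ok]
        if not candidates:
            raise RuntimeError("no healthy network plane available")
        return random.choice(candidates)

nic = MultiPlaneNIC(num_planes=4)
nic.mark_unhealthy(2)      # plane 2 is congested or down
plane = nic.pick_plane()   # traffic lands on one of planes 0, 1, 3
```

A real fabric would make this decision per packet or per flow in NIC hardware with live telemetry, but the principle is the same: independent planes plus dynamic selection turn a plane failure into a routing event rather than a job restart.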
Zettascale10 is now open for preorders and is expected to be available in the second half of next year. Oracle plans to make multi-gigawatt-scale deployments accessible to customers across its distributed cloud regions, enabling organizations to train and deploy massive AI models while maintaining regional data control.
With the introduction of OCI Zettascale10, Oracle is positioning itself as a direct competitor in the race to build the most powerful cloud AI infrastructure – a race currently dominated by hyperscalers like Amazon, Microsoft, and Google. The combination of Oracle’s proprietary networking, NVIDIA’s AI compute platforms, and OpenAI’s early deployment signals a major shift toward industrial-scale AI computing, where efficiency, sustainability, and sovereignty are as important as raw power.
