Oracle and AMD have announced a major partnership aimed at delivering large-scale, high-performance AI infrastructure on Oracle Cloud Infrastructure (OCI). Central to the collaboration is the availability of AMD’s newest Instinct MI355X GPUs, designed to double price-performance for AI training and inference workloads compared with the previous generation.
OCI will host zettascale AI clusters powered by as many as 131,072 MI355X GPUs, making it one of the most ambitious AI cloud offerings to date. The move targets growing demand from enterprises building complex AI models, including large language models and emerging agentic AI applications.
Mahesh Thiagarajan, Executive Vice President of Oracle Cloud Infrastructure, emphasized Oracle’s focus on expanding its AI infrastructure to meet the needs of customers running high-intensity AI workloads. “AMD Instinct GPUs, paired with OCI’s performance, advanced networking, flexibility, security, and scale, will help our customers meet their inference and training needs.”
The new offering is backed by AMD’s latest architectural advancements. The Instinct MI355X GPUs are built to deliver nearly three times the compute performance of their predecessors, along with a 50% improvement in high-bandwidth memory capacity. This enables organizations to train and deploy larger models more efficiently and with reduced latency.
Forrest Norrod, EVP and GM of AMD’s Data Center Solutions Business Group, noted the long-standing alignment between Oracle and AMD in supporting open, high-performance systems. “The MI355X, combined with OCI’s infrastructure and AMD’s Pollara NICs, provides the scale and flexibility needed to power the next wave of AI innovation,” Norrod said.
The collaboration introduces several technical innovations aimed at high-volume AI deployment. AMD’s MI355X shapes on OCI will support up to 288GB of HBM3 memory and 8TB/s of memory bandwidth, allowing large models to be loaded entirely into memory and dramatically improving performance for training and inference tasks. The new FP4 floating-point format also enables cost-efficient execution of generative AI workloads.
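As a rough, back-of-the-envelope illustration (the model size below is a hypothetical example, not a figure from Oracle or AMD), here is how weight memory scales with numeric precision against that 288GB per-GPU capacity:

```python
# Can a large model's weights fit in a single MI355X's 288GB of HBM?
# The parameter count is a hypothetical example, not a quoted figure.
HBM_CAPACITY_GB = 288          # per-GPU HBM capacity cited for the MI355X shape
PARAMS_BILLIONS = 400          # hypothetical large-model parameter count

BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

for fmt, nbytes in BYTES_PER_PARAM.items():
    weight_gb = PARAMS_BILLIONS * nbytes  # weights only; excludes KV cache and activations
    fits = "fits" if weight_gb <= HBM_CAPACITY_GB else "does not fit"
    print(f"{fmt}: ~{weight_gb:.0f} GB of weights -> {fits} in {HBM_CAPACITY_GB} GB of HBM")
```

At FP4, the weights of such a model come to roughly 200GB and fit on a single GPU with headroom for the KV cache, while the same weights at FP16 would need several GPUs, which is the memory argument behind the FP4 claim.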
To handle this performance density, the new infrastructure uses a dense, liquid-cooled design supporting 64 GPUs per rack at 1,400 watts each, delivering up to 125kW per rack. The setup is engineered for both high throughput and lower time-to-first-token (TTFT), which is critical for real-time AI applications.
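A quick sanity check on those rack-level figures (the split of the remaining budget is an assumption, not part of the announcement):

```python
# Rack-level power arithmetic from the quoted numbers.
gpus_per_rack = 64
watts_per_gpu = 1_400
rack_budget_kw = 125

gpu_power_kw = gpus_per_rack * watts_per_gpu / 1_000
print(f"GPU power alone: {gpu_power_kw:.1f} kW per rack")          # 89.6 kW
# The remaining headroom presumably covers host CPUs, NICs, and other
# rack components (assumed breakdown, not stated in the announcement).
print(f"Headroom within {rack_budget_kw} kW: {rack_budget_kw - gpu_power_kw:.1f} kW")
```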
AMD Turin CPUs
OCI’s AI platform will also benefit from a high-performance head node powered by AMD Turin CPUs, offering up to 3TB of system memory to enhance orchestration and data handling. These nodes act as the central coordination point for GPU resources, ensuring optimal utilization across large-scale deployments.
Crucially, the partnership continues Oracle’s commitment to open-source software. AMD’s ROCm software stack allows developers to migrate existing AI code to OCI’s infrastructure with minimal changes, avoiding vendor lock-in. ROCm supports widely adopted AI frameworks, compilers, and libraries, enabling faster development cycles and broader accessibility.
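To illustrate what that migration story typically looks like in practice, the sketch below is ordinary device-agnostic PyTorch. On ROCm builds of PyTorch, the familiar `cuda` device namespace is backed by HIP, so the same script runs on AMD GPUs without modification; this is a general property of PyTorch on ROCm, not an OCI-specific API:

```python
# Minimal device-agnostic PyTorch sketch. On a ROCm build of PyTorch,
# torch.cuda.* is backed by HIP, so "cuda" selects the AMD GPU without
# code changes; on machines with no GPU it falls back to the CPU.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).to(device)
x = torch.randn(8, 1024, device=device)

with torch.no_grad():
    y = model(x)

print(f"Ran forward pass on: {y.device}")
```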
Network architecture also receives a significant boost from AMD’s Pollara AI NICs, which support advanced RoCE (RDMA over Converged Ethernet) capabilities. As the first cloud provider to integrate these NICs, Oracle gains a competitive edge in reducing network latency and increasing throughput for hyperscale AI workloads. Support for the Ultra Ethernet Consortium’s open industry standards ensures interoperability with future networking innovations.
The MI355X GPUs are expected to launch on OCI in the fall of 2025, positioning Oracle and AMD to meet accelerating demand for AI infrastructure. As AI adoption scales rapidly across industries, the collaboration sets a new benchmark for cloud-based, high-performance computing.
