Robert van der Kolk, President of EMEA and APAC at nVent, explains why the next phase of AI infrastructure will depend on rethinking power supply, cooling and how the two work together.
As AI usage expands, the hardware that supports it is becoming as important to the AI revolution as the large language models themselves. While there are different types of data centres, from massive hyperscale projects to edge installations, all data centre operators are focused on the same things: deploying AI quickly, delivering and using power efficiently, and future-proofing IT infrastructure so it is ready to support the needs of AI chips, graphics processing units (GPUs) and tensor processing units.
Power and cooling requirements
The scale of technological change happening in data centres is immense. For context, estimates put typical pre-AI, non-high-performance-computing data centre rack power at about 8 kW per rack. Today, the industry is pushing towards one- to three-megawatt racks, representing a substantial increase in power demand.
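To put that jump in perspective, a quick back-of-envelope calculation using the figures above shows how many legacy racks a single megawatt-class rack matches in power terms:

```python
# Back-of-envelope comparison using the rack power figures from the text.
legacy_rack_kw = 8        # typical pre-AI, non-HPC rack
ai_rack_kw = 1_000        # low end of the 1-3 MW range now being targeted

# Power-equivalence ratio: one AI rack vs. legacy racks
ratio = ai_rack_kw / legacy_rack_kw
print(f"One 1 MW rack draws as much power as {ratio:.0f} legacy racks")
```

Even at the low end of the range, a single AI rack draws as much power as 125 pre-AI racks; at 3 MW the ratio approaches 400.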
As AI applications are deployed across most industries, supporting AI technology requires a major shift in infrastructure thinking.
Cooling reference architectures
Given these power demands, liquid cooling is becoming increasingly important for AI data centres, as air-based cooling technology alone is often insufficient to manage GPUs effectively. To implement suitable solutions successfully, the industry needs a flexible and collaborative framework for developing AI cooling infrastructure standards. Moving towards common infrastructure standards and a more interoperable model across the data centre industry could help companies accelerate development while continuing to innovate in differentiated ways.
Modularity and standard interfaces allow data centres to deploy technologies faster, providing a foundation upon which infrastructure suppliers can innovate for efficiency and performance. Reference architectures can still allow for unique and differentiated designs in coolant distribution units, rear-door coolers, heat rejection units and manifolds, and technology cooling systems, while also providing compatibility and interface standardisation that supports the mixing and matching of products from multiple suppliers.
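The mix-and-match idea can be sketched in software terms. In this illustrative Python example (the protocol and vendor class are hypothetical, not any real standard or product), facility code is written once against a shared coolant distribution unit (CDU) interface, so any supplier's unit that satisfies it can be swapped in:

```python
from typing import Protocol


class CoolantDistributionUnit(Protocol):
    """Hypothetical standard interface a CDU from any supplier could implement."""

    def supply_temp_c(self) -> float: ...
    def set_flow_lpm(self, litres_per_min: float) -> None: ...


class VendorACdu:
    """One supplier's (illustrative) implementation of the shared interface."""

    def supply_temp_c(self) -> float:
        return 32.0  # current coolant supply temperature, deg C

    def set_flow_lpm(self, litres_per_min: float) -> None:
        print(f"Vendor A pump set to {litres_per_min} L/min")


def manage(cdu: CoolantDistributionUnit) -> None:
    # Facility control logic written once against the standard interface
    if cdu.supply_temp_c() > 30.0:
        cdu.set_flow_lpm(150.0)


manage(VendorACdu())
```

A competing supplier's CDU that exposes the same two methods would drop into `manage()` unchanged, which is the practical payoff of interface standardisation.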
The liquid cooling architecture used in today's high-density racks is typically a closed-loop system that uses a treated fluid that is continuously recirculated. By leveraging conduction over convection, direct-to-chip cooling can prevent evaporative loss and improve efficiency compared with air cooling. This architecture not only supports thermal management but can also provide high-grade waste heat, creating opportunities for heat reuse applications across the data centre.
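The sizing of such a loop follows from basic heat-transfer arithmetic: the flow needed to carry away a given heat load is Q = ṁ·cp·ΔT. The numbers below are illustrative assumptions (a 100 kW rack load, a water-like coolant and a 10 K temperature rise), not figures from the article:

```python
# Required coolant mass flow for a direct-to-chip loop: Q = m_dot * cp * dT.
# All values are illustrative assumptions, not nVent specifications.
heat_load_w = 100_000      # 100 kW of rack heat captured by the liquid loop
cp_j_per_kg_k = 4186       # specific heat of a water-based coolant, J/(kg*K)
delta_t_k = 10             # allowed coolant temperature rise across the rack

m_dot = heat_load_w / (cp_j_per_kg_k * delta_t_k)  # mass flow, kg/s
litres_per_min = m_dot * 60                        # ~1 kg/L for water
print(f"{m_dot:.2f} kg/s, roughly {litres_per_min:.0f} L/min")
```

Under these assumptions the loop needs on the order of 140 L/min per rack, which is why coolant distribution units and manifolds become first-class infrastructure at megawatt scale.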
High-voltage DC power
For megawatt-scale rack power delivery, the industry is moving towards 800-volt direct current (VDC) power distribution and, potentially, 1,500 VDC in the longer term.
This shift in power delivery architecture can offer several benefits, including reducing copper usage and minimising resistive losses. Bringing DC power all the way to the racks also reduces the number of AC/DC conversions in a data centre. This can make installation easier and lower costs for data centre operators, while also reducing the power losses that occur whenever power is converted from one form to another.
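The resistive-loss benefit follows directly from Ohm's law: for a fixed power P, current I = P/V, so conductor losses I²R fall with the square of the distribution voltage. The busway resistance below is an assumed illustrative value:

```python
# I^2 * R losses for delivering a fixed power at different DC voltages.
# The 1-milliohm busway resistance is an illustrative assumption.
def resistive_loss_w(power_w: float, volts: float, resistance_ohm: float) -> float:
    current = power_w / volts          # I = P / V
    return current ** 2 * resistance_ohm

P = 1_000_000   # 1 MW delivered to a rack
R = 0.001       # assumed conductor resistance, ohms

for v in (400, 800, 1500):
    print(f"{v:>5} VDC: {resistive_loss_w(P, v, R):,.1f} W lost")
```

Doubling the voltage from 400 to 800 VDC cuts the loss in this sketch by a factor of four, and it also halves the current, which is what allows thinner copper for the same power.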
Moving to DC power can also simplify complex data centre construction projects because electricity does not need to be converted and reconverted multiple times on its way to IT racks. This shift may also improve the scalability of data centres because power distribution infrastructure does not need to be redesigned when IT infrastructure is added, only expanded to include more racks.
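The effect of removing conversion stages compounds, since end-to-end efficiency is the product of each stage's efficiency. The stage counts and per-stage figures below are illustrative assumptions, not measured data:

```python
from math import prod

# End-to-end efficiency is the product of per-stage conversion efficiencies.
# Stage lists and values are illustrative assumptions only.
legacy_chain = [0.98, 0.96, 0.97, 0.95]  # e.g. UPS, PDU transformer, rack PSU, VRM
dc_chain = [0.98, 0.97]                  # e.g. facility rectifier, rack DC/DC

print(f"Legacy AC chain: {prod(legacy_chain):.1%} end-to-end")
print(f"HVDC chain:      {prod(dc_chain):.1%} end-to-end")
```

With these assumed figures the shorter DC chain delivers roughly 95% of the input power to the silicon versus about 87% for the longer AC chain; at megawatt rack scale, each percentage point saved is tens of kilowatts that no longer has to be generated or cooled away.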
However, data centre operators need to understand that products such as AC-to-DC converters, busbars and busways, and rack power delivery, protection and monitoring solutions will need to be redesigned from how they look today to fit into a DC power architecture.
Power and cooling convergence
Cooling and power infrastructure in data centres have to work in tandem. To achieve maximum output efficiency, GPUs need the right amount of power, and the resulting heat energy must be appropriately dissipated by cooling infrastructure to keep these chips within their thermal operating parameters.
Optimising cooling and power together is an area where data centre operators can improve both efficiency and performance. Given the rise in rack power and the resulting transients, a connective framework, advanced control algorithms and a software management layer between the IT equipment and the power and cooling infrastructure are likely to become increasingly important for managing infrastructure efficiently and safely.
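A minimal sketch of what such a software layer does, reduced to a single proportional control loop that raises coolant pump speed as the return temperature climbs above a setpoint. All gains, setpoints and limits are invented for illustration; real facility controls are far more sophisticated and vendor-specific:

```python
# Illustrative proportional controller: pump speed tracks coolant return
# temperature. All numbers are assumptions, not real facility parameters.
def pump_speed_pct(return_temp_c: float,
                   setpoint_c: float = 45.0,
                   gain_pct_per_k: float = 8.0,
                   base_pct: float = 40.0) -> float:
    """Raise pump speed as the coolant runs hotter than the setpoint."""
    error = return_temp_c - setpoint_c
    speed = base_pct + gain_pct_per_k * error
    return max(20.0, min(100.0, speed))  # clamp to a safe operating range

for t in (42.0, 45.0, 50.0):
    print(f"return {t:.0f} C -> pump at {pump_speed_pct(t):.0f}%")
```

The interesting engineering lies in coupling loops like this to the power side: when a training job causes a load transient, the management layer can pre-emptively ramp cooling rather than waiting for temperatures to drift.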
AI's next phase will be shaped largely by the physical layer. Megawatt-class racks are likely to require liquid cooling built on interoperable reference architectures, alongside high-voltage DC distribution to help reduce losses, materials use and complexity. The opportunity now is to bring together power, cooling and software controls in a way that supports efficient, safe and predictable scale.
