Cloud Computing

What Microsoft’s custom silicon means for Azure

Last updated: January 23, 2024 8:26 am
Published January 23, 2024

The history of modern software development has been a dance between what hardware can give and what software demands. Over the decades, the steps in this dance have moved us from the original Intel 8086, whose capabilities now look very basic, to today’s multi-faceted processors, which provide virtualization support, hardware-encrypted memory and data, and extended instruction sets that power the most demanding application stacks.

This dance swings from side to side. Sometimes our software has to stretch to meet the capabilities of a new generation of silicon, and sometimes it has to squeeze out every last ounce of available performance. Now, we’re finally seeing the arrival of a new generation of hardware that mixes familiar CPUs with new system-level accelerators that provide the ability to run complex AI models on both client hardware and servers, both on premises and in the public cloud.

You’ll find AI accelerators not only in the familiar Intel and AMD processors but also in Arm’s latest generation of Neoverse server-grade designs, which mix those features with low power demands (as do Qualcomm’s mobile and laptop offerings). It’s an attractive combination of features for hyperscale clouds like Azure, where low power and high density can help keep costs down while allowing growth to continue.

At the same time, system-level accelerators promise an interesting future for Windows, allowing us to use on-board AI assistants as an alternative to the cloud as Microsoft continues to improve the performance of its Phi series of small language models.

Azure Boost: Silicon for virtualization offload

Ignite 2023 saw Microsoft announce its own custom silicon for Azure, hardware that should start rolling out to customers in 2024. Microsoft has been using custom silicon and FPGAs in its own services for some time now; Zipline hardware compression and the Project Brainwave FPGA-based AI accelerators are good examples. The most recent arrival is Azure Boost, which offloads virtualization processes from the hypervisor and host OS to accelerate storage and networking for Azure VMs. Azure Boost also includes the Cerberus on-board supply chain security chipset.

Azure Boost is intended to give your virtual machine workloads access to as much of the available CPU as possible. Instead of using CPU to compress data or manage security, dedicated hardware takes over, allowing Azure to run more customer workloads on the same hardware. Running systems at high utilization is key to the economics of the public cloud, and any investment in hardware will quickly be paid off.
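A back-of-the-envelope calculation shows why offload hardware pays for itself quickly. The 20% host-overhead figure below is an illustrative assumption, not a number Microsoft has published:

```python
# Illustrative numbers only: suppose the hypervisor, storage, and network
# stacks previously consumed 20% of each server's CPU.
host_overhead = 0.20

sellable_before = 1.0 - host_overhead  # 0.80 of the machine sold to customers
sellable_after = 1.0                   # dedicated silicon absorbs the overhead

# Fractional increase in customer workloads the same server can host.
gain = sellable_after / sellable_before - 1.0
print(f"Extra customer capacity per server: {gain:.0%}")
```

Multiplied across a hyperscale fleet, even a modest per-server gain like this represents a large amount of reclaimed, sellable compute.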

Maia 100: Silicon for large language models

Large language models (and generative AI generally) show the importance of dense compute, with OpenAI using Microsoft’s GPU-based supercomputer to train its GPT models. Even on a system like Microsoft’s, big foundation models like GPT-4, with more than a trillion parameters, require months of training. The next generation of LLMs will need even more compute, both for training and for operation. If we’re building grounded applications around those LLMs using retrieval-augmented generation (RAG), we’ll need additional capacity to create embeddings for our source content and to provide the underlying vector-based search.
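The retrieval side of RAG boils down to embedding text as vectors and ranking documents by cosine similarity to the query. The sketch below uses a deliberately toy trigram-hashing embedding purely to make the mechanics visible; a real pipeline would call an embedding model (the `embed` and `retrieve` helpers here are hypothetical names, not an Azure API):

```python
import zlib
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hash character trigrams into a fixed-size unit vector.
    A production RAG pipeline would call a real embedding model here."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[zlib.crc32(text[i:i + 3].encode()) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    scored = sorted(((float(np.dot(q, embed(d))), d) for d in documents),
                    reverse=True)
    return [doc for _, doc in scored[:top_k]]

docs = [
    "Azure Boost offloads storage and networking from the host.",
    "Maia 100 is Microsoft's AI accelerator for training and inference.",
    "Cobalt 100 is a 128-core Arm server processor.",
]
print(retrieve("Which chip accelerates AI training?", docs))
```

At production scale, both the embedding step and the similarity search over millions of vectors are exactly the kind of dense numeric work that benefits from accelerator capacity.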

GPU-based supercomputers are a significant investment, even when Microsoft can recoup some of the capital costs from subscribers. Operational costs are also large, with hefty cooling requirements on top of power, bandwidth, and storage. So, we might expect those resources to be limited to very few data centers, where there’s sufficient space, power, and cooling.

But if large-scale AI is to be a successful differentiator for Azure, versus competitors such as AWS and Google Cloud, it will need to be available everywhere and it will need to be affordable. That will require new silicon (for both training and inferencing) that can be run at higher densities and at lower power than today’s GPUs.

Looking back at Azure’s Project Brainwave FPGAs, these used programmable silicon to implement key algorithms. While they worked well, they were single-purpose devices that acted as accelerators for specific machine learning models. You could develop a variant that supported the complex neural networks of an LLM, but it would need to implement a massive array of simple processors to support the multi-dimensional vector arithmetic that drives these semantic models. That’s beyond the capabilities of most FPGA technologies.

Vector processing is something that modern GPUs are very good at (not surprisingly, as many of the original architects began their careers developing vector processing hardware for early supercomputers). A GPU is basically an array of simple processors that work with matrices and vectors, using technologies like Nvidia’s CUDA to provide access to linear algebra functions that aren’t commonly part of a CPU’s instruction set. The resulting acceleration lets us build and use modern AI models like LLMs.
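A minimal NumPy illustration of the difference, assuming nothing beyond a toy weight matrix: both computations below produce the same matrix-vector product, but the first works one multiply-add at a time the way a lone scalar core would, while the second is a single vectorized operation of the kind a GPU spreads across thousands of lanes:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128
W = rng.standard_normal((n, n)).astype(np.float32)  # a layer's weight matrix
x = rng.standard_normal(n).astype(np.float32)       # an input activation vector

# Scalar style: one multiply-accumulate per step, like a single CPU core.
y_loop = np.zeros(n, dtype=np.float32)
for i in range(n):
    for j in range(n):
        y_loop[i] += W[i, j] * x[j]

# Vectorized style: one matrix-vector product, the unit of work GPUs
# (and CPU SIMD units) execute across many lanes in parallel.
y_vec = W @ x

# Same result, up to float32 rounding differences in accumulation order.
assert np.allclose(y_loop, y_vec, atol=1e-2)
```

An LLM's forward pass is, at heart, millions of such products over far larger matrices, which is why dedicated vector hardware matters so much.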

Microsoft’s new custom AI accelerator chip, Maia 100, is designed for both training and inference. Building on lessons learned running OpenAI workloads, Maia is intended to fit alongside existing Azure infrastructure, as part of a new accelerator rack unit that sits alongside existing compute racks. With over 100 billion transistors delivered by a five-nanometer process, the Maia 100 is certainly a very large and very dense chip, with much more compute capability than a GPU.

Maia was developed and refined alongside OpenAI’s models, and it uses a new rack design that includes custom liquid-based cooling elements. That last part is key to delivering AI workloads to more than the largest Azure data centers. Adding liquid cooling infrastructure to a building is expensive, so building it into the Maia 100 racks ensures that they can be dropped into any data center, anywhere in the world.

Installing Maia 100 racks does require readjusting rack spacing, as the cooling system makes them larger than Azure’s typical 21-inch racks, which are sized for Open Compute Project servers. In addition to the liquid cooling hardware, the extra space is used for 4.8 Tbps high-bandwidth interconnects, essential for pushing large amounts of data between CPUs and accelerators.

There are still questions about how applications will get to use the new chips. Absent additional details, it’s likely that they’ll run Microsoft-provided AI models, like OpenAI’s and Hugging Face’s, as well as Microsoft’s own Cognitive Services and the Phi small language models. If they become available for training your own models, expect to see a new class of virtual machines alongside the current range of GPU options in Azure AI Studio.

Cobalt 100: Azure’s own Arm processor

Alongside the unveiling of Maia, Microsoft announced its own Arm server processor, the Cobalt 100. This is a 128-core, 64-bit processor based on Arm’s Neoverse reference design, built for high-density, low-power applications. Azure is already using Arm processors for some of its platform services, and Cobalt 100 is likely to support these and more services, rather than being used for infrastructure as a service.

There’s no need to know if your Azure App Service code is running on Intel, AMD, or Arm, as long as it performs well and your users get the results they expect. We can expect to see Cobalt processors running internet-facing services, where density and power efficiency are important requirements, as well as hosting elements of Azure’s content delivery network outside of its main data centers.
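That architecture-neutrality is the point: the same deployment artifact runs unchanged on Intel, AMD, or Arm hosts. If you're ever curious which silicon a workload actually landed on, a quick sketch using only Python's standard library:

```python
import platform

# The application code needs no changes either way; this only reports
# which architecture the runtime is executing on.
arch = platform.machine()  # e.g. 'x86_64'/'AMD64' on x86, 'aarch64'/'arm64' on Arm
print(f"Running on {arch}")
```

In practice you would reach for this only when debugging or benchmarking, never in application logic.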

Microsoft describes its silicon engineering as a way of delivering a “systems approach” to its Azure data centers, with end-to-end support from its initial storage and networking offerings to its own compute services. And it’s not only Azure. Better silicon is coming to Windows too, as NPU-enabled processors from Intel and Qualcomm start to arrive in 2024’s desktops and laptops. After many years of software leading hardware, it will be interesting to see how we can push these new platforms to their limits with code.

Copyright © 2024 IDG Communications, Inc.

