Red Hat has launched the Red Hat AI Inference Server, which allows enterprises to run generative AI applications faster and more efficiently, the company announced today (May 20).
Launched at this week's Red Hat Summit in Boston, the new AI inference server software builds upon the open source vLLM project and incorporates technology from Red Hat's recent acquisition of startup Neural Magic.
It features tools that compress trained AI models so they run more efficiently. It also makes more efficient use of processor memory, enabling faster inferencing across hybrid cloud environments, the company said.
According to industry analysts, the company's moves highlight how AI acceleration encompasses not just fast processors but also optimized software.
"AI puts a lot of stress on computing systems, and with the advent of AI agents, it will put even more stress in the future," Rick Villars, IDC's group vice president of worldwide research, told DCN. "Red Hat is saying they want to help you optimize your investments.
"As you go from model building to embedding it into your business processes or customer experiences, they are going to do everything they can at the software level to make sure you get maximum performance."
Optimized AI Models
The Red Hat AI Inference Server accelerates inferencing, meaning it provides faster generative AI model responses and handles more users concurrently without requiring additional hardware, the company said.
The software does so by optimizing the use of GPUs through techniques such as better memory management and continuous batching. A Red Hat spokesperson said the technology can optimize AMD and Nvidia GPUs, Intel's Gaudi AI accelerators and Google TPUs.
The AI inference server can also be used to optimize AI models such as DeepSeek, Google's Gemma, Meta's open source Llama, Mistral, Microsoft's Phi and other large language models.
Red Hat makes validated and optimized AI models available on Hugging Face, the company said.
"Pre-optimized models running on vLLM typically deliver two to four times more token production – so a much higher level of efficiency," said Brian Stevens, Red Hat's senior vice president and AI chief technology officer, during a media briefing.
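To give a concrete sense of what running a pre-optimized model on vLLM looks like, the minimal sketch below uses vLLM's open source Python API to load a quantized model from Hugging Face and generate text. It assumes the vllm package and a supported accelerator are installed, and the model identifier is illustrative rather than an official Red Hat repository name.

```python
# Minimal sketch: serving a quantized (compressed) model with vLLM's Python API.
# The model name below is hypothetical, used only for illustration.
from vllm import LLM, SamplingParams

# vLLM downloads the weights from Hugging Face and applies its own
# memory management and continuous batching under the hood.
llm = LLM(model="example-org/llama-3.1-8b-instruct-w4a16")  # hypothetical repo

prompts = [
    "Summarize the benefits of quantized inference in one sentence.",
    "What does continuous batching do for throughput?",
]
params = SamplingParams(temperature=0.2, max_tokens=128)

# generate() batches the prompts together, which is where much of the
# throughput gain over one-request-at-a-time serving comes from.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```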
Before vLLM launched two years ago, inference server options were limited, though Nvidia offered one in its software stack, Stevens said. Now, vLLM has gained traction thanks to its ease of use, its ability to run models from Hugging Face, its OpenAI-compatible interface and its support for multiple AI accelerators.
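That OpenAI-compatible interface means existing client code can be pointed at a vLLM-based server with little change. The sketch below is a minimal illustration using the standard openai Python client; it assumes a server is already running locally on port 8000, and the model name is again hypothetical.

```python
# Minimal sketch: calling a vLLM-based server through its OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local server address
    api_key="not-needed-locally",         # vLLM does not require a real key by default
)

response = client.chat.completions.create(
    model="example-org/llama-3.1-8b-instruct-w4a16",  # hypothetical model name
    messages=[{"role": "user", "content": "Explain what an inference server does."}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```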
The AI Inference Server, which is Red Hat's implementation of vLLM, can be deployed as a standalone containerized offering. It can also be deployed as an integrated component of Red Hat's AI software portfolio. That portfolio includes Red Hat Enterprise Linux AI, a version of the open source OS tailored for AI, and Red Hat OpenShift AI, a platform for building and deploying AI applications in containerized Kubernetes environments on-premises and in the cloud.
Virtualization Market Growth
During this week's Red Hat Summit, executives said they have seen more than 150% growth in Red Hat OpenShift Virtualization deployments since 2024.
To attract more virtualization customers, Red Hat said Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure are making Red Hat OpenShift Virtualization available as technology or public previews.
The company also announced the general availability of its virtualization software on Amazon Web Services (AWS) and IBM Cloud.
"Customers, when they're choosing their next-generation virtualization platform, want to go wherever their infrastructure choice leads them, and we wanted to really hone and build out those relationships with our cloud providers," said Mike Barrett, vice president and general manager of Red Hat's Hybrid Cloud Platforms, in a media briefing.
Jim Mercer, IDC's program vice president of software development, DevOps and DevSecOps, said Red Hat has put a lot of effort into improving its virtualization software. And while the company is not claiming to match rival Broadcom feature for feature, Red Hat is implying that it offers most of the major virtualization features customers want.
"A lot of customers who have Red Hat OpenShift also have VMware vSphere, so Red Hat already has a foothold," Mercer said. "Red Hat is trying to take advantage of the fact that, 'You know us as a partner. We're going to help you with the migration, and we're going to make the migration as easy as possible for you.'"
Red Hat Summit 2025: More Key Announcements
At the Red Hat Summit, the company also announced:
- Red Hat Enterprise Linux 10. The new OS, available today, includes new security features that protect against attacks from future quantum computers. The "image mode" feature allows the OS to be deployed as a bootable container image. By containerizing the OS and applications, enterprises can streamline management using the same consistent tools and workflows, the company said.
- New llm-d open source community to scale inferencing. Red Hat announced the launch of the llm-d community, whose goal is to leverage vLLM and scale inferencing through a distributed approach. Founding contributors are CoreWeave, Google, IBM Research, and Nvidia. Other members include AMD, Cisco, Intel, Lambda, and Mistral AI.
- Lightspeed generative AI assistants. To address the skills gap, Red Hat introduced Lightspeed in Enterprise Linux 10, allowing IT administrators to use natural language to get assistance with everything from troubleshooting common problems to managing complex environments. In June, Red Hat will release OpenShift Lightspeed, a generative AI assistant for managing and troubleshooting the OpenShift environment.
- Red Hat Advanced Developer Suite. Red Hat announced the Advanced Developer Suite, which combines platform engineering tools and security capabilities.
- More cloud news. Red Hat OpenShift is now available on Oracle Cloud Infrastructure, while Red Hat AI Inference Server is available on Google Cloud.
