AI hardware startup Cerebras Systems has launched a new, third-generation AI processor that it claims is the fastest in the world. The WSE-3 chip doubles the performance of its predecessor, which was the previous record holder, the company said today (March 13).
“Once again, we’ve delivered the biggest and fastest AI chip in the world with the same dinner-plate-size form factor,” said Andy Hock, Cerebras’ vice president of product management.
The Sunnyvale, California-based startup entered the hardware market in 2019 when it launched a super-sized AI chip, called the Wafer Scale Engine (WSE), which measured eight inches by eight inches. It was 56 times larger than the largest GPU and featured 1.2 trillion transistors and 400,000 computing cores, making it the fastest and largest AI chip available at the time.
Then in 2021, Cerebras launched the WSE-2, a 7-nanometer chip that doubled the performance of the original with 2.6 trillion transistors and 850,000 cores.
900,000 Cores
The company today nearly doubled performance again with the WSE-3 chip, which features 4 trillion transistors and 900,000 cores, delivering 125 petaflops of performance. The new 5-nanometer processor powers Cerebras’ new CS-3 AI server, which is designed to train the largest AI models.
“The CS-3 is a big step forward for us,” Hock told DCN. “It’s two times more performance than our CS-2 [server]. So, it’s two times faster training for large AI models with the same power draw, and it’s available at the same price [as the CS-2] to our customers.”
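Taken together, those figures show transistor counts roughly doubling each generation while core counts grow more slowly. A minimal Python sketch, using only the numbers quoted in this article, makes the generation-over-generation ratios explicit:

```python
# Generation-over-generation ratios for Cerebras' Wafer Scale Engines,
# using only the transistor and core counts quoted in this article.
# (A flops figure, 125 petaflops, is quoted for the WSE-3 alone, so raw
# performance isn't compared here.)

wse_generations = [
    # (name, transistors, cores)
    ("WSE",   1.2e12, 400_000),
    ("WSE-2", 2.6e12, 850_000),
    ("WSE-3", 4.0e12, 900_000),
]

for (prev, pt, pc), (cur, t, c) in zip(wse_generations, wse_generations[1:]):
    print(f"{prev} -> {cur}: {t / pt:.2f}x transistors, {c / pc:.2f}x cores")
```

Run as written, this prints a roughly 2.2x transistor and 2.1x core jump from WSE to WSE-2, and a 1.5x transistor, 1.06x core jump from WSE-2 to WSE-3, consistent with the "doubled performance" claims resting on more than core count alone.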
Since its launch, Cerebras has positioned itself as an alternative to Nvidia GPU-powered AI systems. The startup’s pitch: instead of using thousands of GPUs, customers can run their AI training on Cerebras hardware using significantly fewer chips.
“One [Cerebras] server can do the same work as 10 racks of GPUs,” said Karl Freund, founder and principal analyst of Cambrian AI Research.
Cerebras Makes Inroads Into AI Market
Nvidia dominates the AI market, with its GPUs capturing about 85% of the AI chip market, while the remaining players such as AMD, Intel, Google, AWS, Microsoft, Cerebras and others have captured about 15%, the analyst said.
While the competition has not yet proven that it can steal a big chunk of market share from Nvidia, Cerebras has found success since it launched its first product five years ago, said Freund, who calls Cerebras the most successful AI startup today.
“From the beginning, Cerebras took a very different approach,” he said. “Everybody else is trying to outdo Nvidia, which is really hard to do. Cerebras said, ‘We’re going to build an entire wafer-scale AI engine,’ which no one has ever done. The benefit is incredibly high performance.”
Cloud Access
Cerebras doesn’t make money selling its processors. It makes money selling servers that run on those chips, which, according to a company spokesperson, cost millions of dollars each. Cerebras makes its CS-3 systems available to customers over the cloud, but it also sells to large enterprises, government agencies, and international cloud providers.
For example, Cerebras recently added healthcare provider Mayo Clinic to its growing roster of customers, which includes Argonne National Laboratory and pharmaceutical giant GlaxoSmithKline.
In July 2023, Cerebras also announced it had inked a $100 million deal to build the first of nine interconnected, cloud-based AI supercomputers for G42, a technology holding group based in the United Arab Emirates.
Since then, the two companies have built two supercomputers totaling eight exaflops of AI compute. Available over the cloud, the supercomputers are optimized for training large language models and generative AI models and are being used by organizations across different industries for climate, health and energy research and other projects.
Cerebras and G42 are currently building a third supercomputer, Condor Galaxy 3 in Dallas, which will be powered by 64 CS-3 systems and will produce eight exaflops of AI compute. By the end of 2024, the companies plan to complete the nine supercomputers, which will total 55.6 exaflops of compute.
“The fact that Cerebras has now produced a third-generation Wafer Scale Engine is a testament to its customer traction. They generated the kind of revenue they needed to pay for all that engineering,” Freund said.
By the Numbers: WSE-3 Chip and CS-3 AI System
Cerebras’ WSE-3 features 52 times more cores than Nvidia’s H100 Tensor Core GPU. When compared to an Nvidia DGX H100 system, the Cerebras CS-3 system, powered by the WSE-3 chip, performs training eight times faster, features 1,900 times more memory and can train AI models of up to 24 trillion parameters, which is 600 times larger than a DGX H100’s capabilities, Cerebras executives said.
A Llama 70-billion-parameter model that takes 30 days to train on GPUs can be trained in one day using a CS-3 cluster, Hock said.
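As a rough cross-check, those ratios can be inverted to see what baseline they imply for the Nvidia system. The sketch below is a back-of-the-envelope derivation from the article’s figures, not a statement of published Nvidia specs:

```python
# Back-of-the-envelope: invert the comparison ratios quoted above to see
# what Nvidia baseline they imply. These are derived values, not
# published Nvidia specifications.

cs3_cores = 900_000       # WSE-3 cores, per Cerebras
cs3_max_params = 24e12    # 24 trillion parameters, per Cerebras
core_ratio = 52           # WSE-3 cores vs. one H100, per Cerebras
param_ratio = 600         # CS-3 vs. DGX H100 max trainable model size

print(f"Implied H100 core count: {cs3_cores / core_ratio:,.0f}")
print(f"Implied DGX H100 max model size: "
      f"{cs3_max_params / param_ratio / 1e9:.0f}B parameters")

# The Llama 70B claim (30 days on GPUs vs. one day on a CS-3 cluster)
# amounts to a roughly 30x wall-clock speedup:
print(f"Claimed speedup: {30 / 1:.0f}x")
```

The implied per-H100 figure of roughly 17,000 cores is in the ballpark of the H100’s published CUDA core count, which suggests the 52x comparison counts wafer cores against individual CUDA cores rather than whole GPUs.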
Cerebras Partners with Qualcomm on AI Inferencing
Because Cerebras’ hardware focuses on AI training, it previously didn’t have an answer for customers’ AI inferencing needs. Now it does, thanks to a new partnership with Qualcomm.
The two companies today said they have collaborated so that models trained on Cerebras’ hardware are optimized to run inferencing on Qualcomm’s Cloud AI 100 Ultra accelerator.
“They optimized the output of the big CS-3 machines to run really well on these very low-cost, low-power Qualcomm AI inferencing engines,” Freund said.