Amazon Web Services has scored another major win for its custom AWS Trainium accelerators after striking a deal with AI video startup Decart. The partnership will see Decart optimise its flagship Lucy model on AWS Trainium3 to support real-time video generation, highlighting the growing popularity of AI accelerators over Nvidia's graphics processing units.
Decart is effectively going all-in on AWS: as part of the deal, the company will also make its models available through the Amazon Bedrock platform. Developers can integrate Decart's real-time video generation capabilities into almost any cloud application without worrying about the underlying infrastructure.
Distribution through Bedrock expands AWS's plug-and-play capabilities and demonstrates Amazon's confidence in growing demand for real-time AI video. It also allows Decart to broaden its reach and grow adoption among the developer community. AWS Trainium gives Lucy the extra processing grunt needed to generate high-fidelity video without sacrificing quality or latency.
Custom AI accelerators like Trainium offer an alternative to Nvidia's GPUs for AI workloads. While Nvidia still dominates the AI market, with its GPUs processing the overwhelming majority of AI workloads, it is facing a growing threat from custom processors.
Why all the fuss over AI accelerators?
AWS Trainium isn't the only option developers have. Google's Tensor Processing Unit (TPU) line and Meta's Training and Inference Accelerator (MTIA) chips are other examples of custom silicon, each with a similar advantage over Nvidia's GPUs: their application-specific integrated circuit (ASIC) architecture. As the name suggests, ASIC hardware is engineered specifically to handle one type of application, and to do so more efficiently than general-purpose processors.
While central processing units are generally considered the Swiss Army knife of the computing world thanks to their ability to handle many kinds of applications, GPUs are more akin to a powerful electric drill. They are vastly more powerful than CPUs at churning through huge volumes of repetitive, parallel computations, making them well suited to AI applications and graphics rendering tasks.
If the GPU is a power drill, the ASIC might be considered a scalpel, designed for extremely precise procedures. When building ASICs, chipmakers strip out every functional unit irrelevant to the target task for greater efficiency: all of the chip's operations are dedicated to that task.
This yields big performance and energy-efficiency benefits compared to GPUs, which may explain their growing popularity. A case in point is Anthropic, which has partnered with AWS on Project Rainier, an enormous cluster made up of hundreds of thousands of Trainium2 processors.
Anthropic says that Project Rainier will provide it with hundreds of exaflops of computing power to run its most advanced AI models, including Claude Opus 4.5.
AI coding startup Poolside is also using AWS Trainium2 to train its models, and plans to use the infrastructure for inference in future. Meanwhile, Anthropic is hedging its bets, also looking to train future Claude models on a cluster of up to a million Google TPUs. Meta Platforms is reportedly collaborating with Broadcom to develop a custom AI processor to train and run its Llama models, and OpenAI has similar plans.
The Trainium advantage
Decart chose AWS Trainium2 for its performance, which let the company achieve the low latency required by real-time video models. Lucy has a time-to-first-frame of 40 ms, meaning it begins generating video almost instantly after a prompt. By streamlining video processing on Trainium, Lucy can also match the quality of much slower, more established video models like OpenAI's Sora 2 and Google's Veo 3, with Decart producing output at up to 30 fps.
Decart believes Lucy will improve further. As part of its agreement with AWS, the company has received early access to the newly announced Trainium3 processor, capable of outputs of up to 100 fps at lower latency. "Trainium3's next-generation architecture delivers greater throughput, lower latency, and better memory efficiency, allowing us to achieve up to 4x faster frame generation at half the cost of GPUs," said Decart co-founder and CEO Dean Leitersdorf in a statement.
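Those frame rates translate directly into per-frame latency budgets. The back-of-the-envelope sketch below uses only the figures quoted above (40 ms time-to-first-frame, 30 fps on Trainium2, 100 fps on Trainium3); it is an illustration of the arithmetic, not Decart's actual pipeline.

```python
# Illustrative latency-budget arithmetic for real-time video generation.
# Only the fps and time-to-first-frame figures come from the article;
# the helper itself is a hypothetical sketch.

def frame_budget_ms(fps: float) -> float:
    """Maximum time (ms) each frame may take to sustain a given frame rate."""
    return 1000.0 / fps

TIME_TO_FIRST_FRAME_MS = 40.0  # Lucy's reported startup latency

for fps in (30, 100):
    print(f"{fps} fps -> each frame must be ready in {frame_budget_ms(fps):.1f} ms")
# 30 fps  -> 33.3 ms per frame (Trainium2)
# 100 fps -> 10.0 ms per frame (Trainium3)
```

In other words, moving from 30 fps to 100 fps cuts the per-frame budget from roughly 33 ms to 10 ms, which is why the quoted throughput and latency gains matter for real-time use.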
Nvidia may not be too worried about custom AI processors. The AI chip giant is reported to be designing its own ASICs to rival cloud competitors'. Moreover, ASICs aren't going to replace GPUs entirely, as each chip has its own strengths. The flexibility of GPUs means they remain the only real option for general-purpose models like GPT-5 and Gemini 3, and they are still dominant in AI training. However, many AI applications have stable processing requirements, making them particularly well suited to running on ASICs.
The rise of custom AI processors is expected to have a profound impact on the industry. By pushing chip design towards greater customisation and improving the performance of specialised applications, they are setting the stage for a new wave of AI innovation, with real-time video at the forefront.
Photo courtesy AWS re:Invent
