Overcoming the compute trap

Last updated: November 22, 2024 12:29 pm
Published November 22, 2024

Narek Tatevosyan, Product Director at Nebius AI, explores how AI startups can increase efficiency by optimising their tech stack and using full-stack platforms.

The current generative AI boom is pushing the boundaries of technology, transforming industries, and driving up demand for compute. Many AI start-ups are falling into the ‘compute trap’, focusing on getting access to the latest, most powerful hardware regardless of cost, rather than optimising their existing infrastructure or finding more effective and efficient approaches to building GenAI applications.

While GPU power will always be essential to training large AI models and other machine learning applications, it isn’t the only thing that matters. Without state-of-the-art CPUs, high-speed network interface cards like the InfiniBand 400 ND, DDR5 memory, and a motherboard and server rack that can tie it all together, it is impossible to get maximum performance from an NVIDIA H100 or other top-spec GPUs. As well as taking a broader view of compute, focusing on a more holistic approach to developing AI applications – one that includes efficient data preparation, optimised training runs, and scalable inference infrastructure – allows you to scale and evolve your AI applications in a sustainable way.
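One concrete example of that broader view is the data pipeline feeding the GPU. The sketch below is a generic PyTorch loader configuration with a synthetic dataset – an illustration of the kind of settings that keep an accelerator busy, not code from Nebius or any other specific platform.

```python
# A minimal sketch of one "holistic" lever described above: making sure
# the data pipeline can feed the GPU so an H100 is not left idle on I/O.
# Generic PyTorch with a synthetic dataset; not tied to any vendor stack.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for a real, preprocessed training dataset.
dataset = TensorDataset(torch.randn(10_000, 512), torch.randint(0, 2, (10_000,)))

loader = DataLoader(
    dataset,
    batch_size=256,
    shuffle=True,
    num_workers=4,            # CPU workers prepare batches in parallel with the GPU
    pin_memory=True,          # page-locked host memory speeds up host-to-GPU copies
    prefetch_factor=2,        # each worker keeps two batches queued ahead of time
    persistent_workers=True,  # avoid re-spawning workers every epoch
)

device = "cuda" if torch.cuda.is_available() else "cpu"
for features, labels in loader:
    # non_blocking=True lets the copy overlap with GPU compute
    features = features.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    # ... forward/backward pass would go here ...
```

Tuning details like these cost nothing compared with extra GPUs, which is the point: throughput problems are often solved in the surrounding stack rather than by buying more accelerators.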

The problem with compute

All else being equal, the more compute you have available and the larger your dataset, the more powerful the AI models you can build. For example, Meta’s Llama 3.1 8B and 405B LLMs were trained on the same 15 trillion token dataset using NVIDIA H100s – but the 8B model took 1.46 million GPU hours, while the significantly more powerful 405B model took 30.84 million GPU hours.
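To put those figures in perspective, here is a back-of-the-envelope calculation in Python. The GPU-hour numbers are the ones quoted above; the hourly H100 rental price is purely an illustrative assumption, not a figure from Meta or from this article.

```python
# Back-of-the-envelope comparison of the Llama 3.1 training figures above.
GPU_HOURS_8B = 1.46e6               # Llama 3.1 8B, as quoted above
GPU_HOURS_405B = 30.84e6            # Llama 3.1 405B, as quoted above
ASSUMED_H100_PRICE_PER_HOUR = 2.50  # USD per GPU-hour; hypothetical rental rate

ratio = GPU_HOURS_405B / GPU_HOURS_8B
cost_8b = GPU_HOURS_8B * ASSUMED_H100_PRICE_PER_HOUR
cost_405b = GPU_HOURS_405B * ASSUMED_H100_PRICE_PER_HOUR

print(f"405B used {ratio:.1f}x the GPU hours of 8B")
print(f"At an assumed ${ASSUMED_H100_PRICE_PER_HOUR:.2f}/GPU-hour:")
print(f"  8B:   ${cost_8b:,.0f}")
print(f"  405B: ${cost_405b:,.0f}")
```

At roughly 21 times the GPU hours, the step from 8B to 405B shows how quickly frontier-scale training budgets move out of reach for most start-ups.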

In the real world, of course, all else is seldom equal, and very few AI companies have the resources to compete head-on with a Meta. Instead of falling into the compute trap and trying to match the compute spend of some of the largest companies in the world, it can be a more effective competitive strategy to focus holistically on the entire tech stack driving your ML development.

It’s also worth noting that while Llama 8B isn’t as powerful as Llama 405B, it is still an effective and competitive LLM that outperforms many older, larger models. While Meta clearly used an enormous amount of compute in developing Llama, the researchers were innovating aggressively in other areas too.

The full-stack advantage

Using a single platform to handle everything – from data preparation and labelling to model training, fine-tuning and even inference – comes with a number of advantages.

Developing and deploying an AI application with a single full-stack provider means your team has to learn a single set of tools, rather than several different platforms. Equally, your data stays on a single platform, so you don’t have to deal with the complexities and inefficiencies of multi-cloud operations. Perhaps most usefully, if you run into any issues you are dealing with a single support team that understands the other layers in your stack.

And, of course, there can be financial benefits: by using the same infrastructure for data handling, training, and inference, you are more likely to get better pricing from your infrastructure provider.

Beyond the big three

While hyperscalers like AWS, Microsoft Azure, and Google Cloud might seem the obvious choice if you are investing in a single platform, they can have downsides for many if not most AI companies.

Most notably, the Big Three cloud computing platforms are expensive. If you operate an extremely well-funded start-up or a big tech company, this might not be a problem – but for the majority of AI companies, the bigger cloud providers don’t offer the best ROI. Moreover, they aren’t optimised for AI-specific operations, and as a result you pay significant premiums for features you don’t need.

Dedicated full-stack AI platforms like Nebius offer a much more efficient solution. As well as providing more affordable compute and hardware setups optimised for both training and inference, they include only the tools and features needed for developing AI applications. You can focus on developing, training and optimising your AI models, confident that they are on the right hardware for the job, rather than navigating a sprawling server backend or wondering why your expensive GPUs aren’t getting the data throughput they should.

While taking a full-stack approach to ML development requires investment, it minimises your ongoing infrastructure costs. Building a better-optimised application from the start not only saves on training runs, but should also reduce the cost of inference. These kinds of savings can compound over several generations of AI models: a slightly more efficient prototype can lead to a much more efficient production model, and so on into the future. The decisions you make now could be what gives your company the runway to make it to an IPO.
