Overcoming the compute trap – Data Centre Review

Last updated: November 22, 2024 12:29 pm
Published November 22, 2024

Narek Tatevosyan, Product Director at Nebius AI, explores how AI startups can boost efficiency by optimising their tech stack and using full-stack platforms.

The current generative AI boom is pushing the boundaries of technology, transforming industries, and driving up demand for compute. Many AI start-ups are falling into the 'compute trap', focusing on gaining access to the latest, most powerful hardware regardless of the cost, rather than optimising their existing infrastructure or finding more effective and efficient approaches to building GenAI applications.

While GPU power will always be essential to training large AI models and other machine learning applications, it isn't the only thing that matters. Without state-of-the-art CPUs, high-speed network interface cards like the InfiniBand 400 ND, DDR5 memory, and a motherboard and server rack that can tie it all together, it's impossible to get maximum performance from an NVIDIA H100 or other top-spec GPUs. As well as taking a broader view of compute, focusing on a more holistic approach to developing AI applications that includes efficient data preparation, optimised training runs, and scalable inference infrastructure can allow you to scale and evolve your AI applications in a sustainable way.

The problem with compute

All else being equal, the more compute you have available and the larger your dataset, the more powerful the AI models you can build. For example, Meta's Llama 3.1 8B and 405B LLMs were trained on the same 15 trillion token dataset using NVIDIA H100s – but the 8B version took 1.46 million GPU hours, while the considerably more powerful 405B version took 30.84 million GPU hours.
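
To put those figures in perspective, here is a small back-of-the-envelope calculation using the GPU-hour numbers above; the per-GPU-hour rate is a placeholder assumption, not a quoted price.

    # Back-of-the-envelope comparison using the GPU-hour figures above.
    # The $/GPU-hour rate is a placeholder assumption, not a quoted price.
    GPU_HOURS_8B = 1.46e6      # Llama 3.1 8B: 1.46 million H100 GPU hours
    GPU_HOURS_405B = 30.84e6   # Llama 3.1 405B: 30.84 million H100 GPU hours
    ASSUMED_RATE = 2.50        # hypothetical $ per H100 GPU hour

    for name, hours in [("8B", GPU_HOURS_8B), ("405B", GPU_HOURS_405B)]:
        print(f"Llama 3.1 {name}: {hours:,.0f} GPU hours ≈ ${hours * ASSUMED_RATE:,.0f}")

    print(f"405B vs 8B compute ratio: {GPU_HOURS_405B / GPU_HOURS_8B:.1f}x")

Whatever rate you plug in, the 405B run costs roughly 21 times as much as the 8B run – a scale of spend most start-ups simply cannot match.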

In the real world, of course, all else is seldom equal, and very few AI companies have the resources to compete head-on with a Meta. Instead of falling into the compute trap and trying to match the compute spend of some of the largest companies in the world, it can be a more effective competitive strategy to focus holistically on the whole tech stack driving your ML development.

It's also worth noting that while Llama 8B isn't as powerful as Llama 405B, it's still an effective and competitive LLM that outperforms many older, larger models. While Meta clearly used an enormous amount of compute in developing Llama, the researchers were innovating aggressively in other areas too.

The full-stack advantage

Using a single platform to manage everything – from data preparation and labelling to model training, fine-tuning and even inference – comes with a number of advantages.

Developing and deploying an AI application on a single full-stack provider means your team has to learn to use a single set of tools, rather than several different platforms. Similarly, your data stays on a single platform, so you don't have to deal with the complexities and inefficiencies of multi-cloud operations. Perhaps most usefully, if you run into any issues you are dealing with a single support team that understands the other layers in your stack.

And, of course, there can be financial benefits: by using the same infrastructure for data handling, training, and inference, you are more likely to get better pricing from your infrastructure provider.

Beyond the big three

While hyperscalers like AWS, Microsoft Azure, and Google Cloud might seem the obvious choice if you are investing in a single platform, they can have downsides for many if not most AI companies.

Most notably, the Big Three cloud computing platforms are expensive. If you run an extremely well-funded start-up or a big tech company, this might not be an issue – but for the majority of AI companies, the bigger cloud providers don't offer the best ROI. Moreover, they aren't optimised for AI-specific operations, and as a result you pay significant premiums for features you don't need.

Dedicated full-stack AI platforms like Nebius offer a far more efficient solution. As well as providing more affordable compute and hardware setups optimised for both training and inference, they include only the tools and features needed for developing AI applications. You can focus on developing, training and optimising your AI models, confident that they are on the right hardware for the job, rather than navigating a sprawling server backend or wondering why your expensive GPUs aren't getting the data throughput they should.
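
As a rough illustration of that last point, the sketch below times how much of each training step is spent waiting for data versus doing GPU work, assuming a generic PyTorch training loop; the model, dataset and batch size are placeholders rather than anything platform-specific.

    # Minimal sketch: estimate how much of each training step the GPU spends
    # waiting on input data versus computing. Model, dataset and batch size
    # are placeholders; the measurement pattern is the point.
    import time
    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
    opt = torch.optim.AdamW(model.parameters())
    loss_fn = nn.CrossEntropyLoss()

    # Dummy in-memory dataset standing in for a real training pipeline.
    data = TensorDataset(torch.randn(8192, 1024), torch.randint(0, 10, (8192,)))
    loader = DataLoader(data, batch_size=256, num_workers=0)  # raise num_workers for real pipelines

    wait_time = compute_time = 0.0
    it = iter(loader)
    while True:
        t0 = time.perf_counter()
        try:
            x, y = next(it)          # time spent here is the GPU waiting for data
        except StopIteration:
            break
        t1 = time.perf_counter()

        x, y = x.to(device), y.to(device)
        loss = loss_fn(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        if device == "cuda":
            torch.cuda.synchronize()  # make the timing reflect finished GPU work
        t2 = time.perf_counter()

        wait_time += t1 - t0
        compute_time += t2 - t1

    total = wait_time + compute_time
    print(f"waiting on data: {wait_time:.2f}s ({100 * wait_time / total:.0f}% of step time)")
    print(f"GPU compute:     {compute_time:.2f}s ({100 * compute_time / total:.0f}% of step time)")

If the 'waiting on data' share is high, spending more on GPUs won't help; the bottleneck is in storage, networking or data preparation further down the stack.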

While leveraging a full-stack approach to ML development requires investment, it minimises your ongoing infrastructure costs. Building a better-optimised application from the start not only saves on training runs, but should also reduce the cost of inference. These kinds of savings can compound over multiple generations of AI models. A slightly more efficient prototype can lead to a far more efficient production model, and so on into the future. The decisions you make now could be what gives your company the runway to make it to an IPO.
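
That compounding effect is easy to illustrate with made-up numbers; the 15% per-generation saving and the $1m baseline below are purely hypothetical.

    # Illustrative only: how a modest, repeated efficiency gain compounds
    # across model generations. Both numbers are hypothetical.
    baseline_cost = 1_000_000      # hypothetical cost of one training generation ($)
    saving_per_generation = 0.15   # hypothetical 15% efficiency gain each generation

    cost = baseline_cost
    for generation in range(1, 6):
        cost *= 1 - saving_per_generation
        print(f"generation {generation}: ~${cost:,.0f} vs ${baseline_cost:,.0f} baseline")

After five generations the same workload costs less than half the baseline – budget that can go into data, talent or runway instead.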
