
Most cloud-based genAI performance stinks

Last updated: February 1, 2024 6:07 pm
Published February 1, 2024

I’ve been asked if generative AI systems are always slow. Of course, I reply, “Slow, as compared to what?” The response I always get is funny. “Slower than we thought it would be.” And the circle continues.

Performance is often an afterthought in generative AI development and deployment. Most teams deploying generative AI systems, in the cloud or elsewhere, have yet to learn what the performance of their systems should be, take no steps to measure it, and end up complaining about performance after deployment. Or, more often, the users complain, and then generative AI designers and developers complain to me.

Challenges of generative AI performance

At their essence, generative AI systems are complex, distributed, data-oriented systems that are challenging to build, deploy, and operate. They are all different, with different moving parts. Most of the parts are distributed everywhere, from the source databases for the training data, to the output data, to the core inference engines that often run on cloud providers.

Here is my list of the most common difficulties:

Complex deployment landscapes. Generative AI systems often comprise various components, including data ingestion services, storage, computing, and networking. Architecting these components to work together often leads to overcomplexity, where performance issues, which are determined by the poorest-performing components, are difficult to isolate. I’ve seen poorly performing networks and saturated databases. Those things are not directly related to generative AI, but they can cause performance problems nonetheless.

AI model tuning. Performance is not solely a function of infrastructure, although many conclude that it is. The AI models themselves must be tuned and optimized, which requires deep technical expertise that few have.


Vendors could have done a better job establishing best practices in performance tuning. Many enterprises are concerned that they may worsen things or introduce issues that cause erroneous outcomes. This can’t be ignored, and depending on the type of generative AI system you’re working on in the cloud, you need to figure this out by working with the generative AI service providers.

Security concerns. Protecting AI models and their data against unauthorized access and breaches goes without saying, especially in cloud environments where multitenancy is common. Too many performance issues raise security risks.

In many instances, security mechanisms, such as encryption, introduce performance issues that, if not resolved, will worsen as the data grows. Architecture and testing are your friends here. Take some time to understand how security affects generative AI performance.

Regulatory compliance. Related to security is adherence to data governance and compliance standards. They can impose additional layers of performance management complexity.

Much like security, we need to figure out how to work with these requirements. Most of the time, we can find a happy medium to provide the compliance we need. As with optimized performance, it just takes some trial and error.
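
The point above about isolating the poorest-performing component can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `StageTimer` class, the stage names, and the `time.sleep` calls standing in for real work are all assumptions made for the example. The idea is simply to time each stage of a request separately so that a slow network hop, a saturated database, or an encryption step shows up by name.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def slowest(self):
        """Return the name of the stage with the largest total time."""
        return max(self.totals, key=self.totals.get)


def handle_request(timer):
    # Hypothetical stages; each sleep stands in for real work.
    with timer.stage("retrieve_context"):   # e.g. a database or vector lookup
        time.sleep(0.03)
    with timer.stage("model_inference"):    # e.g. a call to the model endpoint
        time.sleep(0.01)
    with timer.stage("encrypt_and_store"):  # e.g. security and compliance work
        time.sleep(0.005)


timer = StageTimer()
for _ in range(3):
    handle_request(timer)

for name, seconds in sorted(timer.totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.3f}s")
print("slowest stage:", timer.slowest())
```

In this made-up run the retrieval stage dominates, which is the kind of finding that has nothing to do with the model itself. Timing the security and compliance steps as their own stages also gives you a number to argue about when looking for that happy medium.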

Generative AI best practices

Remember that the best practices I list here are holistic. They don’t consider the specific type of generative AI system you’re running, and those systems have very different components and platform considerations. You’ll have to check with your specific generative AI provider about how these are carried out for your particular use cases. Given that warning, here are a few to consider:


Implement automation for scaling and resource optimization, such as the autoscaling that cloud providers offer. This includes using machine learning operations (MLOps) techniques and approaches for operating AI models.

Utilize serverless computing, which abstracts away infrastructure management. This means you no longer have to allocate the resources your generative AI will need; it’s done automatically. Although I’m not always okay with turning the keys over to an automated process that will allocate resources that we have to pay for, given all the other things you need to be concerned with, this is one less thing to worry about.

Conduct regular load testing and performance evaluations. Ensure that your generative AI systems can handle peak demands. Most skip this and guess how much the load will be at the top of the curve. Can you say “outage”?

Employ a continuous learning approach. AI models should be regularly updated with new data and refined to maintain performance and relevance.

Tap into the expertise and support of cloud service providers. Also, make sure to monitor online communities supporting your specific technology stack. You’ll find many answers there that $700-an-hour consultants won’t be able to provide.
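
To make the load-testing advice above concrete, here is a minimal Python sketch of measuring latency under concurrent load rather than guessing at it. Everything specific in it is an assumption for illustration: `call_model` is a hypothetical stand-in that sleeps for a random interval instead of calling a real endpoint, and the request count and concurrency level are arbitrary. A real test would replace `call_model` with a request to your own service and use numbers that reflect your expected peak.

```python
import random
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def call_model(prompt):
    """Hypothetical stand-in for a request to a generative AI endpoint."""
    time.sleep(random.uniform(0.005, 0.02))  # simulated response time
    return f"response to {prompt}"


def timed_call(prompt):
    """Return the wall-clock latency of one request, in seconds."""
    start = time.perf_counter()
    call_model(prompt)
    return time.perf_counter() - start


def run_load_test(num_requests, concurrency):
    """Send num_requests with the given concurrency and summarize latency."""
    prompts = [f"prompt {i}" for i in range(num_requests)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, prompts))
    # quantiles(n=100) returns 99 cut points; index 49 is p50, index 94 is p95.
    cuts = statistics.quantiles(latencies, n=100)
    return {
        "requests": len(latencies),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": max(latencies),
    }


report = run_load_test(num_requests=40, concurrency=8)
print(report)
```

Reporting a high percentile alongside the median matters because users at the top of the curve experience the tail, not the average. Rerunning the same test at increasing concurrency is one simple way to see where latency starts to climb before your users find out for you.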

I suspect that generative AI performance will become an area of focus more than it is today. Perhaps it should be, given the amount of resources and cash we’re focusing on this exploding space.

Copyright © 2024 IDG Communications, Inc.
