Cloud Computing

Model quantization and the dawn of edge AI

Last updated: January 22, 2024 5:19 am
Published January 22, 2024

The convergence of artificial intelligence and edge computing promises to be transformative for many industries. The rapid pace of innovation in model quantization, a technique that shrinks models and speeds up computation by lowering the numerical precision of their parameters, is playing a pivotal role in that convergence.

Model quantization bridges the gap between the computational limitations of edge devices and the demands of deploying highly accurate models for faster, more efficient, and more cost-effective edge AI solutions. Breakthroughs like generalized post-training quantization (GPTQ), low-rank adaptation (LoRA), and quantized low-rank adaptation (QLoRA) have the potential to foster real-time analytics and decision-making at the point where data is generated.

Edge AI, when combined with the right tools and techniques, could redefine the way we interact with data and data-driven applications.

Why edge AI?

The purpose of edge AI is to bring data processing and models closer to where data is generated, such as on a remote server, tablet, IoT device, or smartphone. This enables low-latency, real-time AI. According to Gartner, more than half of all data analysis by deep neural networks will happen at the edge by 2025. This paradigm shift will bring multiple advantages:

  • Reduced latency: By processing data directly on the device, edge AI reduces the need to transmit data back and forth to the cloud. This is critical for applications that depend on real-time data and require rapid responses.
  • Reduced costs and complexity: Processing data locally at the edge eliminates expensive data transfer costs to send information back and forth. 
  • Privacy preservation: Data remains on the device, reducing security risks associated with data transmission and data leakage. 
  • Better scalability: The decentralized approach with edge AI makes it easier to scale applications without relying on a central server for processing power.

For example, a manufacturer can implement edge AI into its processes for predictive maintenance, quality control, and defect detection. By running AI and analyzing data locally from smart machines and sensors, manufacturers can make better use of real-time data to reduce downtime and improve production processes and efficiency.

The role of model quantization

For edge AI to be effective, AI models need to be optimized for performance without compromising accuracy. AI models are becoming larger and more complex, making them harder to handle. This creates challenges for deploying AI models at the edge, where devices typically have limited compute, memory, and power budgets.

Model quantization reduces the numerical precision of model parameters (from 32-bit floating point to 8-bit integer, for example), making the models lightweight and suitable for deployment on resource-constrained devices such as mobile phones, edge devices, and embedded systems. 
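The precision reduction described above can be sketched in a few lines. This is a minimal illustration of symmetric post-training quantization from float32 to int8, not the full calibration pipeline a production toolkit would use; the function names and the toy weight vector are illustrative:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric quantization of float32 weights to int8."""
    scale = np.abs(weights).max() / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32, and the per-weight
# rounding error is bounded by half the scale factor.
```

Real quantization schemes add per-channel scales, zero points for asymmetric ranges, and calibration over representative data, but the storage and bandwidth savings come from exactly this precision reduction.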

Three techniques have emerged as potential game changers in the domain of model quantization, namely GPTQ, LoRA, and QLoRA:

  • GPTQ involves compressing models after they’ve been trained. It’s ideal for deploying models in environments with limited memory. 
  • LoRA involves fine-tuning large pre-trained models for inferencing. Specifically, it fine-tunes smaller matrices (known as a LoRA adapter) that make up the large matrix of a pre-trained model.
  • QLoRA is a more memory-efficient variant that quantizes the pre-trained model (typically to 4-bit precision) and then fine-tunes LoRA adapters on top of it. LoRA and QLoRA are especially beneficial when adapting models to new tasks or data sets with restricted computational resources.
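The low-rank idea behind LoRA can be illustrated with plain matrices. This is a toy sketch, not a training loop: the dimensions, rank, and initialization below are illustrative assumptions, though initializing B to zero so the adapted model starts identical to the base model follows LoRA's standard recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 64, 64, 4           # layer dimensions and low rank r << min(d, k)
W0 = rng.normal(size=(d, k))  # frozen pre-trained weight matrix

# LoRA trains only two small matrices A and B; W0 is never updated.
A = rng.normal(scale=0.01, size=(r, k))
B = np.zeros((d, r))          # B starts at zero, so training begins from W0

def lora_forward(x):
    # Effective weight is W0 + B @ A, computed without materializing
    # the full d-by-k update matrix.
    return W0 @ x + B @ (A @ x)

x = rng.normal(size=(k,))
# With B = 0, the adapted layer matches the frozen base layer exactly.
# Trainable parameters shrink from d*k = 4096 to r*(d + k) = 512.
```

The fine-tuning cost drops because gradients are only needed for A and B, and a deployed system can keep one frozen base model plus many small task-specific adapters.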

Selecting from these methods depends heavily on the project’s unique requirements: whether the project is at the fine-tuning stage or the deployment stage, and what computational resources are at its disposal. By using these quantization techniques, developers can effectively bring AI to the edge, creating a balance between performance and efficiency, which is critical for a wide range of applications.

Edge AI use cases and data platforms

The applications of edge AI are vast. From smart cameras that process images for rail car inspections at train stations, to wearable health devices that detect anomalies in the wearer’s vitals, to smart sensors that monitor inventory on retailers’ shelves, the possibilities are boundless. That’s why IDC forecasts edge computing spending to reach $317 billion in 2028. The edge is redefining how organizations process data.

As organizations recognize the benefits of AI inferencing at the edge, the demand for robust edge inferencing stacks and databases will surge. Such platforms can facilitate local data processing while offering all of the advantages of edge AI, from reduced latency to heightened data privacy. 

For edge AI to thrive, a persistent data layer is essential for local and cloud-based management, distribution, and processing of data. With the emergence of multimodal AI models, a unified platform capable of handling various data types becomes critical for meeting edge computing’s operational demands. A unified data platform enables AI models to seamlessly access and interact with local data stores in both online and offline environments. Additionally, federated learning, in which models are trained across several devices holding local data samples without exchanging the raw data, promises to alleviate current data privacy and compliance issues.
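The train-across-devices idea described above (commonly called federated learning) can be sketched as a federated-averaging loop. Everything here is a toy illustration under stated assumptions: a simple least-squares model, four simulated devices, and made-up data shapes and learning rate. The key property is that only parameters, never raw data, cross device boundaries:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: four edge devices, each holding a private slice of
# data generated from the same underlying linear model.
true_w = np.array([1.0, -2.0, 0.5])

def make_device_data():
    X = rng.normal(size=(20, 3))
    y = X @ true_w + 0.1 * rng.normal(size=20)
    return X, y

devices = [make_device_data() for _ in range(4)]

def local_update(w, data, lr=0.1):
    # Each device takes a gradient step on its own data; only the updated
    # parameters (never the raw samples) leave the device.
    X, y = data
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

global_w = np.zeros(3)
for _ in range(100):                        # federated averaging rounds
    local_ws = [local_update(global_w, d) for d in devices]
    global_w = np.mean(local_ws, axis=0)    # server averages parameters only
```

Production systems layer secure aggregation, client sampling, and compression on top of this loop, but the privacy argument rests on the same structure: the server sees model updates, not data.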

As we move towards intelligent edge devices, the fusion of AI, edge computing, and edge database management will be central to heralding an era of fast, real-time, and secure solutions. Looking ahead, organizations can focus on implementing sophisticated edge strategies for efficiently and securely managing AI workloads and streamlining the use of data within their business.

Rahul Pradhan is VP of product and strategy at Couchbase, a provider of a modern database for enterprise applications that 30% of the Fortune 100 depend on. Rahul has over 20 years of experience leading and managing engineering and product teams focusing on databases, storage, networking, and security technologies in the cloud.

—

Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.

Copyright © 2023 IDG Communications, Inc.
