Data Center News
Cloud Computing

Model quantization and the dawn of edge AI

Last updated: January 22, 2024 5:19 am
Published January 22, 2024

The convergence of artificial intelligence and edge computing promises to be transformative for many industries. The rapid pace of innovation in model quantization, a technique that reduces model size and improves portability to enable faster computation, is playing a pivotal role in that convergence.

Model quantization bridges the gap between the computational limitations of edge devices and the demands of deploying highly accurate models for faster, more efficient, and more cost-effective edge AI solutions. Breakthroughs like generalized post-training quantization (GPTQ), low-rank adaptation (LoRA), and quantized low-rank adaptation (QLoRA) have the potential to foster real-time analytics and decision-making at the point where data is generated.

Edge AI, when combined with the right tools and techniques, could redefine the way we interact with data and data-driven applications.

Why edge AI?

The purpose of edge AI is to bring data processing and models closer to where data is generated, such as on a remote server, tablet, IoT device, or smartphone. This enables low-latency, real-time AI. According to Gartner, more than half of all data analysis by deep neural networks will happen at the edge by 2025. This paradigm shift will bring multiple advantages:

  • Reduced latency: By processing data directly on the device, edge AI reduces the need to transmit data back and forth to the cloud. This is critical for applications that depend on real-time data and require rapid responses.
  • Reduced costs and complexity: Processing data locally at the edge eliminates expensive data transfer costs to send information back and forth. 
  • Privacy preservation: Data remains on the device, reducing security risks associated with data transmission and data leakage. 
  • Better scalability: The decentralized approach with edge AI makes it easier to scale applications without relying on a central server for processing power.

For example, a manufacturer can implement edge AI into its processes for predictive maintenance, quality control, and defect detection. By running AI and analyzing data locally from smart machines and sensors, manufacturers can make better use of real-time data to reduce downtime and improve production processes and efficiency.

The role of model quantization

For edge AI to be effective, AI models need to be optimized for performance without compromising accuracy. AI models are becoming larger and more complex, making them harder to handle. This creates challenges at the edge, where devices have limited resources and are constrained in their ability to support such models.

Model quantization reduces the numerical precision of model parameters (from 32-bit floating point to 8-bit integer, for example), making the models lightweight and suitable for deployment on resource-constrained devices such as mobile phones, edge devices, and embedded systems. 
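The precision reduction described above can be sketched in a few lines of NumPy. This is a minimal, illustrative symmetric per-tensor scheme (the function names and scaling choice are this sketch's own, not taken from any particular quantization library):

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: float32 -> int8 plus one scale factor."""
    scale = np.abs(weights).max() / 127.0   # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights at inference time."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print(q.nbytes, w.nbytes)           # 16 64 -- a 4x reduction in storage
print(np.abs(w - w_hat).max() < s)  # True: rounding error stays below one step
```

Real post-training quantization methods (GPTQ among them) are considerably more sophisticated, calibrating per-channel scales and compensating for quantization error, but the storage-for-precision trade is the same.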

Three techniques have emerged as potential game changers in the domain of model quantization, namely GPTQ, LoRA, and QLoRA:

  • GPTQ involves compressing models after they’ve been trained. It’s ideal for deploying models in environments with limited memory. 
  • LoRA involves fine-tuning large pre-trained models efficiently. Specifically, it freezes the pre-trained weights and trains a pair of small low-rank matrices (known as a LoRA adapter) whose product is added to the model’s large weight matrix.
  • QLoRA is a more memory-efficient variant that quantizes the frozen pre-trained model to lower precision while fine-tuning LoRA adapters, reducing GPU memory requirements. LoRA and QLoRA are especially beneficial when adapting models to new tasks or data sets with restricted computational resources.
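The LoRA idea in the list above can be sketched as a toy NumPy example (dimensions, variable names, and initialization are chosen for illustration only). The frozen weight matrix W is augmented with a trainable low-rank product BA; because B starts at zero, the adapted model initially behaves exactly like the original:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4                  # hypothetical layer sizes; rank r << d

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight (not updated)
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # B starts at zero, so the adapter
                                            # is a no-op before fine-tuning

def lora_forward(x: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """y = W x + alpha * B A x  -- only A and B are updated during fine-tuning."""
    return W @ x + alpha * (B @ (A @ x))

x = rng.standard_normal(d_in)
print(np.allclose(lora_forward(x), W @ x))  # True: zero-initialized adapter
# Trainable parameters: r*(d_in + d_out) = 512, vs. d_in*d_out = 4096 for W itself
```

The parameter count in the last comment shows why this matters at the edge: the adapter is an order of magnitude smaller than the layer it adapts, so fine-tuning and shipping task-specific updates becomes far cheaper.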

Selecting from these methods depends heavily on the project’s unique requirements, whether the project is at the fine-tuning stage or deployment, and whether it has the computational resources at its disposal. By using these quantization techniques, developers can effectively bring AI to the edge, creating a balance between performance and efficiency, which is critical for a wide range of applications.


Edge AI use cases and data platforms

The applications of edge AI are vast. From smart cameras that process images for rail car inspections at train stations, to wearable health devices that detect anomalies in the wearer’s vitals, to smart sensors that monitor inventory on retailers’ shelves, the possibilities are boundless. That’s why IDC forecasts edge computing spending to reach $317 billion in 2028. The edge is redefining how organizations process data.

As organizations recognize the benefits of AI inferencing at the edge, the demand for robust edge inferencing stacks and databases will surge. Such platforms can facilitate local data processing while offering all of the advantages of edge AI, from reduced latency to heightened data privacy. 

For edge AI to thrive, a persistent data layer is essential for local and cloud-based management, distribution, and processing of data. With the emergence of multimodal AI models, a unified platform capable of handling various data types becomes critical for meeting edge computing’s operational demands. A unified data platform enables AI models to seamlessly access and interact with local data stores in both online and offline environments. Additionally, distributed inferencing—where models are trained across several devices holding local data samples without actual data exchange—promises to alleviate current data privacy and compliance issues. 
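The privacy-preserving scheme described in the last sentence, where devices train on local data and share only model updates, is commonly known as federated learning. A minimal sketch of its core loop (federated averaging) on a toy one-parameter model, with all data and names invented for illustration:

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """One hypothetical on-device step: fit y = w*x by gradient descent."""
    x, y = data
    grad = 2 * x * (weights * x - y)     # dL/dw for squared error
    return weights - lr * grad.mean()

def federated_average(local_weights, sizes):
    """Server combines device models, weighted by local sample counts.
    Raw data never leaves the devices; only the scalar weights do."""
    return np.average(local_weights, weights=sizes)

# Three devices, each holding private (x, y) samples drawn from y = 2x
devices = [(np.array([1.0, 2.0]), np.array([2.0, 4.0])),
           (np.array([3.0]),      np.array([6.0])),
           (np.array([0.5, 1.5]), np.array([1.0, 3.0]))]

w = 0.0
for _ in range(50):
    updates = [local_update(w, d) for d in devices]
    w = federated_average(updates, sizes=[len(d[0]) for d in devices])
print(round(w, 2))  # 2.0 -- converges to the true slope without pooling data
```

Production systems add secure aggregation, client sampling, and compression on top of this loop, but the compliance argument is visible even in the sketch: the server only ever sees parameters, never samples.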

As we move towards intelligent edge devices, the fusion of AI, edge computing, and edge database management will be central to heralding an era of fast, real-time, and secure solutions. Looking ahead, organizations can focus on implementing sophisticated edge strategies for efficiently and securely managing AI workloads and streamlining the use of data within their business.


Rahul Pradhan is VP of product and strategy at Couchbase, a provider of a modern database for enterprise applications that 30% of the Fortune 100 depend on. Rahul has over 20 years of experience leading and managing engineering and product teams focusing on databases, storage, networking, and security technologies in the cloud.

—

Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.

Copyright © 2023 IDG Communications, Inc.
