
Grok-2 gets a speed bump after developers rewrite code

Last updated: August 24, 2024 6:34 am
Published August 24, 2024


Elon Musk’s xAI has made waves in the last week with the release of its Grok-2 large language model (LLM) chatbot, available through an $8 USD monthly subscription on the social network X.

Now, both versions of Grok-2 (Grok-2 and Grok-2 mini, the latter designed to be less powerful but faster) analyze information and output responses more quickly, after two developers at xAI rewrote the inference code stack entirely in the last three days.

As xAI developer Igor Babuschkin posted this afternoon on the social network X under his handle @ibab:

“Grok 2 mini is now 2x faster than it was yesterday. In the last three days @lm_zheng and @MalekiSaeed rewrote our inference stack from scratch using SGLang. This has also allowed us to serve the big Grok 2 model, which requires multi-host inference, at a reasonable speed. Both models didn’t just get faster, but also slightly more accurate. Stay tuned for further speed improvements!”

The two developers responsible are Lianmin Zheng and Saeed Maleki, according to Babuschkin’s post.

To rewrite the inference stack for Grok-2, they relied on SGLang, an open-source (Apache 2.0 licensed), highly efficient system for executing complex language model programs, achieving up to 6.4 times higher throughput than existing systems.

SGLang was developed by researchers from Stanford University, the University of California, Berkeley, Texas A&M University and Shanghai Jiao Tong University. It integrates a frontend language with a backend runtime to simplify the programming of language model applications.

The system is versatile, supporting many models, including Llama, Mistral, and LLaVA, and is compatible with both open-weight models and API-based models such as OpenAI’s GPT-4. SGLang’s ability to optimize execution through automatic cache reuse and parallelism within a single program makes it a powerful tool for developers working with large-scale language models.
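
To make the idea of cache reuse concrete, here is a minimal Python sketch. It is not SGLang's code or API: the `PrefixCache` class and its `process` method are hypothetical names invented for this illustration, and a plain trie of tokens stands in for the cached model state. It only shows why requests that share a prefix, such as a common system prompt, need less fresh computation.

```python
from typing import Dict, List


class PrefixCache:
    """Toy prefix cache (hypothetical, for illustration only).

    Each trie node represents work already done for one token, so a new
    request only pays for the tokens after its longest cached prefix.
    """

    def __init__(self) -> None:
        self.root: Dict[str, dict] = {}

    def process(self, tokens: List[str]) -> int:
        """Store the request and return how many leading tokens were reused."""
        node = self.root
        reused = 0
        for tok in tokens:
            if tok in node:
                reused += 1      # cached prefix still matches
            else:
                node[tok] = {}   # first miss: every later token is new work
            node = node[tok]
        return reused


cache = PrefixCache()
system = "You are a helpful assistant .".split()

# The first request finds an empty cache, so nothing is reused.
first = cache.process(system + "What is SGLang ?".split())

# The second request shares the six-token system prompt with the first.
second = cache.process(system + "Who wrote it ?".split())

print(first, second)
```

In this toy run the second request reuses the shared system prompt and only its four new tokens count as fresh work. A real serving system caches model state rather than strings, but the bookkeeping follows the same pattern.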

Grok-2 and Grok-2-Mini Performance Highlights

Additionally, in the latest update to the third-party LMSYS Chatbot Arena leaderboard, which rates AI model performance, the main Grok-2 model has secured the #2 spot with an impressive Arena Score of 1293, based on 6686 votes.

This effectively puts Grok-2 in the number two spot (fittingly) among the most powerful AI models in the world, tied with Google’s Gemini-1.5 Pro model and just behind OpenAI’s latest version of ChatGPT-4o.

Grok-2-mini, which has also benefited from the recent improvements, has climbed to the #5 position, boasting an Arena Score of 1268 from 7266 votes, just behind GPT-4o mini and Claude 3.5 Sonnet.
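
For a rough sense of what a 25-point gap between these two scores means, the sketch below converts a score difference into an expected head-to-head win rate. It assumes the Arena Scores sit on a conventional Elo-style scale (base 10, divisor 400). The article does not describe how the leaderboard fits its scores, so treat this as an approximation, and note that `expected_win_rate` is a helper name made up for this example.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Elo-style expected win probability of A over B (assumed scale, see lead-in)."""
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


grok2, grok2_mini = 1293, 1268  # Arena Scores quoted in the article
p = expected_win_rate(grok2, grok2_mini)
print(f"Grok-2 expected win rate vs Grok-2-mini: {p:.3f}")  # about 0.536
```

Under that assumption, the larger model would be preferred in roughly 54 of 100 direct comparisons, which suggests the two models are closer in user preference than their leaderboard positions alone might imply.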

Both models are proprietary to xAI, reflecting the company’s commitment to advancing AI technology.

Grok-2 has distinguished itself particularly in mathematical tasks, where it ranks #1. The model also holds strong positions across various other categories, including Hard Prompts, Coding, and Instruction-following, where it consistently ranks near the top.

This performance places Grok-2 ahead of other prominent models like OpenAI’s GPT-4o (May 2024), which now ranks #4.

Future Developments

According to a reply by Babuschkin on X, the main advantage of using Grok-2-mini over the full Grok-2 model is its enhanced speed.

Yes, that’s the main reason for now. We’ll make it even faster than it is right now.

— ibab (@ibab) August 23, 2024

However, Babuschkin pledged that xAI would further improve the processing speed of Grok-2-mini, which could make it an even more attractive option for users seeking high performance with lower computational overhead.

The addition of Grok-2 and Grok-2-mini to the Chatbot Arena leaderboard and their subsequent performance have garnered significant attention within the AI community.

The models’ success is a testament to xAI’s ongoing innovation and its commitment to pushing the boundaries of what AI can achieve.

As xAI continues to refine its models, the AI landscape can expect further improvements in both speed and accuracy, keeping Grok-2 and Grok-2-mini at the forefront of AI development.

