Data Center News > AI & Compute

MiniMax unveils open source LLM with staggering 4M token context

Last updated: January 15, 2025 2:39 am
Published January 15, 2025


MiniMax is perhaps best known today in the U.S. as the Singaporean company behind Hailuo, a realistic, high-resolution generative AI video model that competes with Runway, OpenAI's Sora, and Luma AI's Dream Machine.

But the company has many more tricks up its sleeve: today, for instance, it announced the release and open-sourcing of the MiniMax-01 series, a new family of models built to handle ultra-long contexts and enhance AI agent development.

The series includes MiniMax-Text-01, a foundation large language model (LLM), and MiniMax-VL-01, a visual multi-modal model.

A large context window

MiniMax-Text-01 is of particular note for enabling up to 4 million tokens in its context window, equivalent to a small library's worth of books. The context window is how much information the LLM can handle in a single input/output exchange, with words and concepts represented as numerical "tokens," the LLM's own internal mathematical abstraction of the data it was trained on.
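A rough back-of-the-envelope check on that "small library" claim (the words-per-token and words-per-book figures below are generic ballpark assumptions, not MiniMax tokenizer specifics):

```python
# Rough scale of a 4M-token context window.
# Assumes ~0.75 words per token and ~80,000 words per book;
# both are ballpark estimates for illustration only.
CONTEXT_TOKENS = 4_000_000
WORDS_PER_TOKEN = 0.75
WORDS_PER_BOOK = 80_000

words = CONTEXT_TOKENS * WORDS_PER_TOKEN   # 3,000,000 words
books = words / WORDS_PER_BOOK             # ~37 books
print(f"{words:,.0f} words, roughly {books:.0f} books")
```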

While Google previously led the pack with its Gemini 1.5 Pro model and its 2-million-token context window, MiniMax has remarkably doubled that.

As MiniMax posted on its official X account today: "MiniMax-01 efficiently processes up to 4M tokens — 20 to 32 times the capacity of other leading models. We believe MiniMax-01 is poised to support the anticipated surge in agent-related applications in the coming year, as agents increasingly require extended context handling capabilities and sustained memory."


The models are available now for download on Hugging Face and GitHub under a custom MiniMax license, for users to try directly on Hailuo AI Chat (a ChatGPT/Gemini/Claude competitor), and through MiniMax's application programming interface (API), where third-party developers can link their own apps to them.

MiniMax is offering APIs for text and multi-modal processing at competitive rates:

  • $0.2 per 1 million input tokens
  • $1.1 per 1 million output tokens

For comparison, OpenAI's GPT-4o costs $2.50 per 1 million input tokens through its API, a staggering 12.5 times more expensive.
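At those list prices, a quick worked comparison (rates copied from above; the example request size is arbitrary, and only input-side pricing is compared since GPT-4o's output rate isn't quoted here):

```python
def minimax_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD at MiniMax's published per-million-token rates."""
    return input_tokens / 1e6 * 0.2 + output_tokens / 1e6 * 1.1

# A full 1M-token input with a 10k-token reply:
cost = minimax_cost(1_000_000, 10_000)
print(f"${cost:.3f}")                 # about $0.211

# Input-side ratio vs. GPT-4o at $2.50 per 1M input tokens:
print(round(2.50 / 0.2, 1), "x")      # 12.5x
```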

MiniMax has also integrated a mixture-of-experts (MoE) framework with 32 experts to optimize scalability. This design balances computational and memory efficiency while maintaining competitive performance on key benchmarks.
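The article doesn't describe MiniMax's router, but the general MoE idea can be sketched: a gate scores all experts for each token and only the top-k actually run, so most parameters stay idle on any given forward pass. The shapes, seed, and top-k value below are illustrative assumptions, not MiniMax-01 internals:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_EXPERTS, TOP_K, DIM = 32, 2, 8   # 32 experts, as in MiniMax-01

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Gate: one score per expert for this token.
token = rng.standard_normal(DIM)
gate_w = rng.standard_normal((NUM_EXPERTS, DIM))
scores = softmax(gate_w @ token)

# Route to the top-k experts only; the other 30 are skipped entirely.
top = np.argsort(scores)[-TOP_K:]
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
out = sum(scores[i] * (experts[i] @ token) for i in top)

print(f"active experts: {sorted(top.tolist())}, output shape: {out.shape}")
```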

Breaking new ground with Lightning Attention architecture

At the heart of MiniMax-01 is a Lightning Attention mechanism, an innovative alternative to standard transformer architecture.

This design significantly reduces computational complexity. The models contain 456 billion parameters, with 45.9 billion activated per inference.

Unlike earlier architectures, Lightning Attention employs a mix of linear and traditional SoftMax layers, achieving near-linear complexity for long inputs. SoftMax, for those like myself who are new to the concept, is the transformation of input numbers into probabilities summing to 1, so that the LLM can approximate which meaning of the input is likeliest.
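To make that concrete: softmax turns a vector of scores into probabilities summing to 1, and the appeal of linear attention is that dropping softmax lets the key-value product be computed once and reused, turning the O(n²) score matrix into O(n) work. A minimal numpy illustration of the idea (not MiniMax's actual kernels, which also handle normalization and causal masking):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)                          # probabilities summing to 1

rng = np.random.default_rng(0)
n, d = 1000, 16                       # sequence length, head dimension
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))

# Standard softmax attention: materializes an (n x n) score matrix, O(n^2).
out_sm = softmax(Q @ K.T) @ V

# Linear attention: associativity lets us form K.T @ V once, O(n * d^2).
out_lin = Q @ (K.T @ V)

print(out_sm.shape, out_lin.shape)    # both (1000, 16)
```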

MiniMax has rebuilt its training and inference frameworks to support the Lightning Attention architecture. Key improvements include:

  • MoE all-to-all communication optimization: Reduces inter-GPU communication overhead.
  • Varlen ring attention: Minimizes computational waste in long-sequence processing.
  • Efficient kernel implementations: Tailored CUDA kernels improve Lightning Attention performance.

These advancements make MiniMax-01 models accessible for real-world applications while maintaining affordability.

Efficiency and Benchmarks

On mainstream text and multi-modal benchmarks, MiniMax-01 rivals top-tier models like GPT-4 and Claude-3.5, with especially strong results on long-context evaluations. Notably, MiniMax-Text-01 achieved 100% accuracy on the Needle-in-a-Haystack task with a 4-million-token context.
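Needle-in-a-Haystack tests bury one fact at some depth in a long filler context and ask the model to retrieve it. A toy sketch of the prompt construction (the filler text, needle, and question template are illustrative; the real benchmark varies needle depth and context length systematically):

```python
def build_haystack_prompt(needle: str, filler: str,
                          depth: float, total_chars: int) -> str:
    """Insert `needle` at relative `depth` (0.0 = start, 1.0 = end) of filler."""
    haystack = (filler * (total_chars // len(filler) + 1))[:total_chars]
    pos = int(depth * len(haystack))
    context = haystack[:pos] + " " + needle + " " + haystack[pos:]
    return context + "\n\nQuestion: what is the magic number mentioned above?"

prompt = build_haystack_prompt(
    needle="The magic number is 7481.",
    filler="Lorem ipsum dolor sit amet. ",
    depth=0.5,
    total_chars=2000,
)
print(len(prompt), "chars;", "needle present:", "7481" in prompt)
```

A full harness would sweep `depth` from 0.0 to 1.0 and scale `total_chars` up to the context limit, scoring whether the model's answer contains the needle.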

The models also demonstrate minimal performance degradation as input length increases.

MiniMax plans regular updates to expand the models' capabilities, including code and multi-modal enhancements.

The company views open-sourcing as a step toward building foundational AI capabilities for the evolving AI agent landscape.

With 2025 predicted to be a transformative year for AI agents, the need for sustained memory and efficient inter-agent communication is growing. MiniMax's innovations are designed to meet these challenges.

Open to collaboration

MiniMax invites developers and researchers to explore the capabilities of MiniMax-01. Beyond open-sourcing, its team welcomes technical suggestions and collaboration inquiries at model@minimaxi.com.

With its commitment to cost-effective and scalable AI, MiniMax positions itself as a key player in shaping the AI agent era. The MiniMax-01 series offers an exciting opportunity for developers to push the boundaries of what long-context AI can achieve.

