Friday, 1 May 2026
Subscribe
logo
  • AI Compute
  • Infrastructure
  • Power & Cooling
  • Security
  • Colocation
  • Cloud Computing
  • More
    • Sustainability
    • Industry News
    • About Data Center News
    • Terms & Conditions
Font ResizerAa
Data Center NewsData Center News
Search
  • AI Compute
  • Infrastructure
  • Power & Cooling
  • Security
  • Colocation
  • Cloud Computing
  • More
    • Sustainability
    • Industry News
    • About Data Center News
    • Terms & Conditions
Have an existing account? Sign In
Follow US
© 2022 Foxiz News Network. Ruby Design Company. All Rights Reserved.
Data Center News > Blog > AI & Compute > Tencent releases versatile open-source Hunyuan AI models
AI & Compute

Tencent releases versatile open-source Hunyuan AI models

Last updated: August 4, 2025 7:56 pm
Published August 4, 2025
Share
Tencent releases versatile open-source Hunyuan AI models
SHARE

Tencent has expanded its household of open-source Hunyuan AI fashions which are versatile sufficient for broad use. This new household of fashions is engineered to ship highly effective efficiency throughout computational environments, from small edge units to demanding, high-concurrency manufacturing methods.

The discharge features a complete set of pre-trained and instruction-tuned fashions out there on the developer platform Hugging Face. The fashions are available in a number of sizes, particularly with parameter scales of 0.5B, 1.8B, 4B, and 7B, offering substantial flexibility for builders and companies.

Tencent has indicated that these fashions have been developed utilizing coaching methods much like its extra highly effective Hunyuan-A13B mannequin, permitting them to inherit its efficiency traits. This method allows customers to pick the optimum mannequin for his or her wants, whether or not it’s a smaller variant for resource-constrained edge computing or a bigger mannequin for high-throughput manufacturing workloads, all whereas making certain robust capabilities.

Probably the most notable options of the Hunyuan sequence is its native assist for an ultra-long 256K context window. This enables the fashions to deal with and keep steady efficiency on long-text duties, a significant functionality for complicated doc evaluation, prolonged conversations, and in-depth content material technology. The fashions assist what Tencent calls “hybrid reasoning,” which permits for each quick and sluggish pondering modes that customers can select between relying on their particular necessities.

The corporate has additionally positioned a robust emphasis on agentic capabilities. The fashions have been optimised for agent-based duties and have demonstrated main outcomes on established benchmarks corresponding to BFCL-v3, τ-Bench, and C3-Bench, suggesting a excessive diploma of proficiency in complicated, multi-step problem-solving. As an illustration, on the C3-Bench, the Hunyuan-7B-Instruct mannequin achieves a rating of 68.5, whereas the Hunyuan-4B-Instruct mannequin scores 64.3.

See also  Lightweight LLM powers Japanese enterprise AI deployments

The sequence’ efficiency is a deal with environment friendly inference. Tencent’s Hunyuan fashions utilise Grouped Question Consideration (GQA), a method identified for enhancing processing velocity and decreasing computational overhead. This effectivity is additional enhanced by superior quantisation assist, a key component of the Hunyuan structure designed to decrease deployment limitations.

Tencent has developed its personal compression toolset, AngleSlim, to create a extra user-friendly and efficient mannequin compression resolution. Utilizing this instrument, the corporate affords two most important varieties of quantisation for the Hunyuan sequence.

The primary is FP8 static quantisation, which employs an 8-bit floating-point format. This methodology makes use of a small quantity of calibration information to pre-determine the quantisation scale with out requiring full retraining, changing mannequin weights and activation values into the FP8 format to spice up inference effectivity.

The second methodology is INT4 quantisation, which achieves W4A16 quantisation by the GPTQ and AWQ algorithms:

  • The GPTQ method processes mannequin weights layer by layer, utilizing calibration information to minimise errors within the quantised weights. This course of avoids requiring mannequin retraining and improves inference velocity.
  • The AWQ algorithm works by statistically analysing the amplitude of activation values from a small set of calibration information. It then calculates a scaling coefficient for every weight channel, which expands the numerical vary of vital weights to retain extra data through the compression course of. 

Builders can both use the AngleSlim instrument themselves or obtain the pre-quantised fashions instantly.

Efficiency benchmarks verify the robust capabilities of the Tencent Hunyuan fashions throughout a variety of duties. The pre-trained Hunyuan-7B mannequin, for instance, achieves a rating of 79.82 on the MMLU benchmark, 88.25 on GSM8K, and 74.85 on the MATH benchmark, demonstrating stable reasoning and mathematical expertise.

See also  How to Select the Right Cloud GPU Instance for Deploying AI Models

The instruction-tuned variants present spectacular leads to specialised areas. In arithmetic, the Hunyuan-7B-Instruct mannequin scores 81.1 on the AIME 2024 benchmark, whereas the 4B model scores 78.3. In science, the 7B mannequin reaches 76.5 on OlympiadBench, and in coding, it scores 42 on Livecodebench.

🚀We’re increasing the Tencent Hunyuan open-source LLM ecosystem with 4 compact fashions (0.5B, 1.8B, 4B, 7B)! Designed for low-power eventualities like consumer-grade GPUs, good automobiles, good residence units, cell phones, and PCs, these fashions assist cost-effective fine-tuning… pic.twitter.com/CknskVqPem

— Hunyuan (@TencentHunyuan) August 4, 2025

The quantisation benchmarks present minimal efficiency degradation. On the DROP benchmark, the Hunyuan-7B-Instruct mannequin scores 85.9 in its base B16 format, 86.0 with FP8, and 85.7 with Int4 GPTQ, indicating that effectivity beneficial properties don’t come at a price to accuracy.

For deployment, Tencent recommends utilizing established frameworks like TensorRT-LLM, vLLM, or SGLang to serve the Hunyuan fashions and create OpenAI-compatible API endpoints, making certain they are often built-in easily into current improvement workflows. This mix of efficiency, effectivity, and deployment flexibility positions the Hunyuan sequence as a seamless highly effective contender in open-source AI.

See additionally: Deep Cogito v2: Open-source AI that hones its reasoning expertise

Need to be taught extra about AI and large information from trade leaders? Take a look at AI & Big Data Expo going down in Amsterdam, California, and London. The excellent occasion is co-located with different main occasions together with Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Discover different upcoming enterprise know-how occasions and webinars powered by TechForge here.

See also  Google launches production-ready Gemini 2.5 AI models to challenge OpenAI's enterprise dominance



Source link

TAGGED: Hunyuan, models, opensource, releases, Tencent, Versatile
Share This Article
Twitter Email Copy Link Print
Previous Article Hyve Managed Hosting partners with Digital Realty to expand global operations Hyve Managed Hosting partners with Digital Realty to expand global operations
Next Article ChatGPT rockets to 700M weekly users ahead of GPT-5 launch with reasoning superpowers ChatGPT rockets to 700M weekly users ahead of GPT-5 launch with reasoning superpowers
Leave a comment

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Your Trusted Source for Accurate and Timely Updates!

Our commitment to accuracy, impartiality, and delivering breaking news as it happens has earned us the trust of a vast audience. Stay ahead with real-time updates on the latest events, trends.
FacebookLike
TwitterFollow
InstagramFollow
YoutubeSubscribe
LinkedInFollow
MediumFollow
- Advertisement -
Ad image

Popular Posts

Ethically trained AI startup Pleias releases new small reasoning models optimized for RAG with built-in citations

Be a part of our each day and weekly newsletters for the newest updates and…

April 26, 2025

Is that really your boss calling? Jericho Security raises $15M to stop deepfake fraud that’s cost businesses $200M in 2025 alone

Be a part of our day by day and weekly newsletters for the newest updates…

April 24, 2025

Knight Frank unveils 2025 Global Data Centre Forecast

The report identifies synthetic intelligence (AI), hybrid cloud adoption, and the relentless demand for knowledge…

February 4, 2025

AI investment has dual nature

AI’s rising dominance on this planet, whether or not or not it's reshaping industries’ workflows…

August 26, 2025

SITE develops financial data infrastructure in Qatar

Safe I.T. Environments Ltd (SITE), a UK supplier of modular and micro knowledge centre options,…

March 4, 2026

You Might Also Like

STL launches Neuralis data centre connectivity suite in the U.S.
AI & Compute

STL launches Neuralis data centre connectivity suite in the U.S.

By saad
What is optical interconnect and why Lightelligence's $10B debut says it matters for AI
AI & Compute

What is optical interconnect and why Lightelligence’s $10B debut says it matters for AI

By saad
IBM launches AI platform Bob to regulate SDLC costs
AI & Compute

IBM launches AI platform Bob to regulate SDLC costs

By saad
The evolution of encoders: From simple models to multimodal AI
AI & Compute

The evolution of encoders: From simple models to multimodal AI

By saad

About Us

Data Center News is your dedicated source for data center infrastructure, AI compute, cloud, and industry news.

Top Categories

  • AI & Compute
  • Cloud Computing
  • Power & Cooling
  • Colocation
  • Security
  • Infrastructure
  • Sustainability
  • Industry News

Useful Links

  • Home
  • Contact
  • Privacy Policy
  • Terms & Conditions

Find Us on Socials

© 2026 Data Center News. All Rights Reserved.

© 2026 Data Center News. All Rights Reserved.
Welcome Back!

Sign in to your account

Lost your password?
We use cookies to ensure that we give you the best experience on our website. If you continue to use this site we will assume that you are happy with it.
You can revoke your consent any time using the Revoke consent button.