
AI Exploit Bypasses Guardrails of OpenAI, Other Top LLMs

Last updated: January 2, 2025 11:21 pm
Published January 2, 2025

A new jailbreak technique for OpenAI and other large language models (LLMs) increases the chance that attackers can circumvent cybersecurity guardrails and abuse the system to deliver malicious content.

Discovered by researchers at Palo Alto Networks’ Unit 42, the so-called ‘Bad Likert Judge’ attack asks the LLM to act as a judge scoring the harmfulness of a given response using the Likert scale. The psychometric scale, named after its inventor and commonly used in questionnaires, is a rating scale measuring a respondent’s agreement or disagreement with a statement.

The jailbreak then asks the LLM to generate responses that contain examples that align with the scales, with the ultimate result being that “the example that has the highest Likert scale can potentially contain the harmful content,” Unit 42’s Yongzhe Huang, Yang Ji, Wenjun Hu, Jay Chen, Akshata Rao, and Danny Tsechansky wrote in a post describing their findings.

Tests conducted across a range of categories against six state-of-the-art text-generation LLMs from OpenAI, Azure, Google, Amazon Web Services, Meta, and Nvidia revealed that the technique can increase the attack success rate (ASR) by more than 60% compared with plain attack prompts on average, according to the researchers.
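
For readers unfamiliar with the metric, the short sketch below shows one way an attack success rate could be computed and compared between two prompt styles. It is an illustration only: the `TrialResult` structure, the style labels, and the outcome data are all made up, and it assumes the reported increase is a percentage-point difference between the two rates, which the article does not specify.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    """One attack attempt, labelled by an evaluator (hypothetical structure)."""
    prompt_style: str  # "plain" or "likert_judge" (illustrative labels)
    succeeded: bool    # True if the evaluator judged the response harmful


def attack_success_rate(results, prompt_style):
    """Return the fraction of attempts with the given style that succeeded."""
    attempts = [r for r in results if r.prompt_style == prompt_style]
    if not attempts:
        raise ValueError(f"no attempts recorded for {prompt_style!r}")
    return sum(r.succeeded for r in attempts) / len(attempts)


# Made-up outcomes for illustration only; not data from the research.
results = (
    [TrialResult("plain", s) for s in [True] + [False] * 9]
    + [TrialResult("likert_judge", s) for s in [True] * 7 + [False] * 3]
)

baseline_asr = attack_success_rate(results, "plain")
technique_asr = attack_success_rate(results, "likert_judge")

# Assumption: the increase is read as a percentage-point difference.
increase_points = (technique_asr - baseline_asr) * 100

print(f"baseline ASR:  {baseline_asr:.0%}")
print(f"technique ASR: {technique_asr:.0%}")
print(f"increase:      {increase_points:.0f} percentage points")
```

In practice, the "succeeded" label would come from whatever evaluation process the researchers used, which this sketch does not attempt to reproduce.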


The categories of attacks evaluated in the research involved prompting various inappropriate responses from the system, including: ones promoting bigotry, hate, or prejudice; ones engaging in behavior that harasses an individual or group; ones that encourage suicide or other acts of self-harm; ones that generate inappropriate explicitly sexual material and pornography; ones providing information on how to manufacture, acquire, or use illegal weapons; or ones that promote illegal activities.


Continue reading this article in Dark Reading


