After GPT-4o backlash, researchers benchmark models on moral endorsement—Find sycophancy persists across the board

Last updated: May 23, 2025 3:28 am
Published May 23, 2025
Last month, OpenAI rolled back some updates to GPT-4o after several users, including former OpenAI interim CEO Emmett Shear and Hugging Face chief executive Clément Delangue, said the model flattered users excessively.

The flattery, referred to as sycophancy, often led the model to defer to user preferences, be excessively polite, and fail to push back. It was also annoying. Sycophancy can lead models to release misinformation or reinforce harmful behaviors. And as enterprises begin to build applications and agents on these sycophantic LLMs, they run the risk of the models agreeing to harmful business decisions, encouraging false information to spread and be used by AI agents, and undermining trust and safety policies.

Researchers from Stanford University, Carnegie Mellon University, and the University of Oxford sought to change that by proposing a benchmark to measure models' sycophancy. They called the benchmark Elephant, for Evaluation of LLMs as Excessive SycoPHANTs, and found that every large language model (LLM) exhibits a certain level of sycophancy. By quantifying how sycophantic models can be, the benchmark can guide enterprises in creating guidelines for LLM use.

To test the benchmark, the researchers pointed the models to two personal-advice datasets: QEQ, a set of open-ended personal advice questions about real-world situations, and AITA, posts from the subreddit r/AmITheAsshole, where posters and commenters judge whether people behaved appropriately in a given situation.


The idea behind the experiment is to see how the models behave when faced with such queries. It evaluates what the researchers call social sycophancy: whether the models try to preserve the user's "face," that is, their self-image or social identity.

"More 'hidden' social queries are exactly what our benchmark gets at — instead of previous work that only looks at factual agreement or explicit beliefs, our benchmark captures agreement or flattery based on more implicit or hidden assumptions," Myra Cheng, one of the researchers and co-author of the paper, told VentureBeat. "We chose to look at the domain of personal advice since the harms of sycophancy there are more consequential, but casual flattery would also be captured by the 'emotional validation' behavior."

Testing the models

For the test, the researchers fed the data from QEQ and AITA to OpenAI's GPT-4o, Google's Gemini 1.5 Flash, Anthropic's Claude Sonnet 3.7, open-weight models from Meta (Llama 3-8B-Instruct, Llama 4-Scout-17B-16E, and Llama 3.3-70B-Instruct-Turbo), and Mistral's 7B-Instruct-v0.3 and Mistral Small-24B-Instruct-2501.

Cheng said they "benchmarked the models using the GPT-4o API, which uses a version of the model from late 2024, before OpenAI both implemented the new overly sycophantic model and reverted it."

To measure sycophancy, the Elephant method looks at five behaviors that relate to social sycophancy:

  • Emotional validation, or over-empathizing without critique
  • Moral endorsement, or saying users are morally right even when they are not
  • Indirect language, where the model avoids giving direct suggestions
  • Indirect action, where the model advises passive coping mechanisms
  • Accepting framing that does not challenge problematic assumptions
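Once responses have been labeled against the five behaviors above (the article does not describe how the labeling itself is done), a model-level sycophancy rate falls out of simple aggregation. The sketch below is purely illustrative — the dataset shape, field names, and scoring are assumptions, not the paper's actual implementation:

```python
# Hypothetical aggregation of per-response behavior flags into per-model
# sycophancy rates. Assumes each response has already been labeled (e.g.,
# by annotators or a judge model) for the five Elephant behaviors.
from dataclasses import dataclass

BEHAVIORS = (
    "emotional_validation",
    "moral_endorsement",
    "indirect_language",
    "indirect_action",
    "accepting_framing",
)

@dataclass
class LabeledResponse:
    model: str
    flags: dict  # behavior name -> bool (True if the behavior was observed)

def sycophancy_rates(responses):
    """Return, for each model, the fraction of responses flagged per behavior."""
    totals, counts = {}, {}
    for r in responses:
        counts[r.model] = counts.get(r.model, 0) + 1
        per_model = totals.setdefault(r.model, {b: 0 for b in BEHAVIORS})
        for b in BEHAVIORS:
            per_model[b] += int(r.flags.get(b, False))
    return {
        m: {b: per_model[b] / counts[m] for b in BEHAVIORS}
        for m, per_model in totals.items()
    }

# Toy example: two labeled responses from one hypothetical model.
labeled = [
    LabeledResponse("model-a", {"moral_endorsement": True}),
    LabeledResponse("model-a", {"emotional_validation": True,
                                "moral_endorsement": True}),
]
rates = sycophancy_rates(labeled)
print(rates["model-a"]["moral_endorsement"])  # 1.0 (flagged in 2 of 2)
```

Comparing such per-behavior rates across models is what would let a benchmark like this rank, say, GPT-4o against Gemini 1.5 Flash.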

The test found that all LLMs showed high levels of sycophancy, even more so than humans, and social sycophancy proved difficult to mitigate. However, the test showed that GPT-4o "has some of the highest rates of social sycophancy, while Gemini-1.5-Flash definitively has the lowest."

The LLMs amplified some biases in the datasets as well. The paper noted that posts on AITA carried some gender bias: posts mentioning wives or girlfriends were more often correctly flagged as socially inappropriate, while those mentioning a husband, boyfriend, parent, or mom were misclassified. The researchers said the models "may rely on gendered relational heuristics in over- and under-assigning blame." In other words, the models were more sycophantic toward people with boyfriends and husbands than toward those with girlfriends or wives.

Why it matters

It's nice if a chatbot talks to you like an empathetic entity, and it can feel great when the model validates your comments. But sycophancy raises concerns about models supporting false or concerning statements and, on a more personal level, could encourage self-isolation, delusions, or harmful behaviors.

Enterprises do not want AI applications built on LLMs that spread false information just to be agreeable to users. Such behavior can misalign with an organization's tone or ethics and be very annoying for employees and their platforms' end users.

The researchers said the Elephant method and further testing could help inform better guardrails to prevent sycophancy from increasing.
