
AI chatbots that butter you up make you worse at conflict, study finds
Top AI models keep saying you’re right, and that’s the problem
Sun 5 Oct 2025 // 11:38 UTC
State-of-the-art AI models tend to flatter users, and that praise makes people more convinced that they're right and less willing to resolve conflicts, recent research suggests.
These models, in other words, potentially promote social and psychological harm.
Computer scientists from Stanford University and Carnegie Mellon University have evaluated 11 current machine learning models and found that all of them tend to tell people what they want to hear.
The authors – Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, and Dan Jurafsky – describe their findings in a preprint paper titled, "Sycophantic AI Decreases Prosocial Intentions and Promotes Dependence."
"Across 11 state-of-the-art AI models, we find that models are highly sycophantic: they affirm users’ actions 50 percent more than humans do, and do so even in cases where user queries mention manipulation, deception, or other relational harms," the authors state in their paper.
Sycophancy – servile flattery, often as a way to gain some advantage – has already proven to be a problem for AI models. The phenomenon has also been referred to as "glazing." In April, OpenAI rolled back an update to GPT-4o because of its inappropriately effusive praise of, for example, a user who told the model about a decision to stop taking medicine for schizophrenia.
Anthropic's Claude has also been criticized for sycophancy, so much so that developer Yoav Farhi created a website to track the number of times Claude Code gushes, "You're absolutely right!"
Anthropic suggests [PDF] this behavior has been mitigated in its recent Claude Sonnet 4.5 model release. "We found Claude Sonnet 4.5 to be dramatically less likely to endorse or mirror incorrect or implausible views presented by users," the company said in its Claude 4.5 Model Card report.
That may be the case, but the number of open GitHub issues in the Claude Code repo that contain the phrase "You're absolutely right!" has risen from 48 in August to 108 at the time of writing.
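For readers who want to check the count themselves, a minimal sketch along these lines would do it via GitHub's issue search API. The repo slug, the exact query, and the lack of authentication are assumptions here, not something the article specifies; unauthenticated requests are also rate-limited.

```python
# Sketch: count open issues in the Claude Code repo containing the phrase
# "You're absolutely right!" using GitHub's search API.
# Assumptions: repo slug "anthropics/claude-code"; no auth token (rate-limited).
import requests

QUERY = 'repo:anthropics/claude-code is:issue is:open "You\'re absolutely right!"'

resp = requests.get(
    "https://api.github.com/search/issues",
    params={"q": QUERY},
    headers={"Accept": "application/vnd.github+json"},
    timeout=30,
)
resp.raise_for_status()

# total_count is the number of matching issues reported by the search API
print("Open issues matching phrase:", resp.json()["total_count"])
```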
A training process that uses reinforcement learning from human feedback (RLHF) may be the cause of this obsequious behavior in AI models.
Myra Cheng, a PhD candidate in computer science in the Stanford NLP group and corresponding author for the study, told The Register in an email that she doesn't think there's a definitive answer at this point about how model sycophancy arises.
"Previous work does suggest that it may be due to preference data and the reinforcement learning processes," said Cheng. "But it may also be the case that it is learned from the data that models are pre-trained on, or because humans are highly susceptible to confirmation bias. This is an important direction of future work."
But as the paper points out, one reason that the behavior persists is that "developers lack incentives to curb sycophancy since it encourages adoption and engagement."
The issue is further complicated by the researchers' findings that study participants tended to describe sycophantic AI as "objective" and "fair" – people tend not to see bias when models say they're absolutely right all the time.
The researchers looked at four proprietary models – OpenAI’s GPT-5 and GPT-4o; Google’s Gemini-1.5-Flash; and Anthropic’s Claude Sonnet 3.7 – and at seven open-weight models – Meta’s Llama-3-8B-Instruct, Llama-4-Scout-17B-16E, and Llama-3.3-70B-Instruct-Turbo; Mistral AI’s Mistral-7B-Instruct-v0.3 and Mistral-Small-24B-Instruct-2501; DeepSeek-V3; and Qwen2.5-7B-Instruct-Turbo.
They evaluated how the models responded to various statements culled from different datasets. As noted above, the models endorsed users' reported actions 50 percent more often than humans did in the same scenarios.
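As a toy illustration only – not the paper's actual evaluation pipeline – the comparison boils down to measuring how often responses affirm the user and then computing the relative increase of model over human rates. The keyword heuristic below is a made-up stand-in for whatever classifier the researchers used.

```python
# Toy illustration: estimate how much more often model responses affirm the
# user than human responses do. The phrase list is a crude, assumed heuristic
# chosen purely for demonstration.
AFFIRMING = (
    "you're right",
    "you're absolutely right",
    "that makes sense",
    "you did nothing wrong",
    "great decision",
)

def affirmation_rate(responses):
    """Fraction of responses containing an affirming phrase."""
    hits = sum(any(p in r.lower() for p in AFFIRMING) for r in responses)
    return hits / len(responses)

def relative_increase(model_responses, human_responses):
    """Relative increase of the model's affirmation rate over the human rate."""
    m = affirmation_rate(model_responses)
    h = affirmation_rate(human_responses)
    return (m - h) / h if h else float("inf")

# A value of 0.5 would correspond to the paper's reported 50 percent increase.
```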
The researchers also conducted a live study exploring how 800 participants interacted with sycophantic and non-sycophantic models.
They found "that interaction with sycophantic AI models significantly reduced participants’ willingness to take actions to repair interpersonal conflict, while increasing their conviction of being in the right."
At the same time, study participants rated sycophantic responses as higher quality, trusted the AI model more when it agreed with them, and were more willing to use supportive models again.
The researchers say this suggests people prefer AI models that uncritically endorse their behavior, even at the risk that such cheerleading erodes their judgment and discourages prosocial behavior.
The risk posed by sycophancy may appear to be innocuous flattery, the researchers say, but that's not necessarily the case. They point to research showing that LLMs encourage delusional thinking and to a recent lawsuit [PDF] against OpenAI alleging that ChatGPT actively helped a young man explore methods of suicide.
"If the social media era offers a lesson, it is that we must look beyond optimizing solely for immediate user satisfaction to preserve long-term well-being," the authors conclude. "Addressing sycophancy is critical for developing AI models that yield durable individual and societal benefit."
"We hope that our work is able to motivate the industry to change these behaviors," said Cheng.