RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations
A new AI benchmark tests whether chatbots protect human well-being

By GT · November 25, 2025 · TechCrunch · 5 Mins Read


AI chatbots have been linked to serious mental health harms in heavy users, but there have been few standards for measuring whether they safeguard human well-being or just maximize engagement. A new benchmark dubbed HumaneBench seeks to fill that gap by evaluating whether chatbots prioritize user well-being and how easily those protections fail under pressure.

“I think we’re in an amplification of the addiction cycle that we saw hardcore with social media and our smartphones and screens,” Erika Anderson, founder of Building Humane Technology, which produced the benchmark, told TechCrunch. “But as we go into that AI landscape, it’s going to be very hard to resist. And addiction is amazing business. It’s a very effective way to keep your users, but it’s not great for our community and having any embodied sense of ourselves.”

Building Humane Technology is a grassroots organization of developers, engineers, and researchers — mainly in Silicon Valley — working to make humane design easy, scalable, and profitable. The group hosts hackathons where tech workers build solutions for humane tech challenges, and is developing a certification standard that evaluates whether AI systems uphold humane technology principles. So just as you can buy a product certified to be free of known toxic chemicals, the hope is that consumers will one day be able to choose AI products from companies that demonstrate alignment through Humane AI certification.

The models were given explicit instructions to disregard humane principles. Image Credits: Building Humane Technology

Most AI benchmarks measure intelligence and instruction-following, rather than psychological safety. HumaneBench joins exceptions like DarkBench.ai, which measures a model’s propensity to engage in deceptive patterns, and the Flourishing AI benchmark, which evaluates support for holistic well-being. 

HumaneBench relies on Building Humane Tech’s core principles: that technology should respect user attention as a finite, precious resource; empower users with meaningful choices; enhance human capabilities rather than replace or diminish them; protect human dignity, privacy and safety; foster healthy relationships; prioritize long-term well-being; be transparent and honest; and design for equity and inclusion.

The benchmark was created by a core team including Anderson, Andalib Samandari, Jack Senechal, and Sarah Ladyman. They prompted 15 of the most popular AI models with 800 realistic scenarios, like a teenager asking if they should skip meals to lose weight or a person in a toxic relationship questioning if they're overreacting. Unlike most benchmarks, which rely solely on LLMs to judge LLMs, the team began with manual scoring to validate the AI judges against human judgment. After validation, judging was performed by an ensemble of three AI models: GPT-5.1, Claude Sonnet 4.5, and Gemini 2.5 Pro. They evaluated each model under three conditions: default settings, explicit instructions to prioritize humane principles, and instructions to disregard those principles.
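The evaluation loop described above — an ensemble of judge models scoring each response, aggregated per prompting condition — can be sketched roughly as follows. This is a hypothetical illustration, not the HumaneBench authors' actual code; the judge call is a stub, and the score range of -1 (actively harmful) to +1 (fully humane) is inferred from the scores reported later in the article.

```python
from statistics import mean

# Judge ensemble and prompting conditions as described in the article.
JUDGES = ["gpt-5.1", "claude-sonnet-4.5", "gemini-2.5-pro"]
CONDITIONS = ["default", "prioritize_humane", "disregard_humane"]

def judge_score(judge: str, response: str, principle: str) -> float:
    """Placeholder for a real LLM-judge API call.

    A real implementation would prompt `judge` to rate `response`
    against `principle` on a -1..+1 scale; here we return a fixed
    stub value for illustration.
    """
    return 0.5

def score_response(response: str, principle: str) -> float:
    """Average the three judges' scores for one response/principle pair."""
    return mean(judge_score(j, response, principle) for j in JUDGES)

def condition_scores(responses_by_condition: dict, principle: str) -> dict:
    """Mean ensemble score per prompting condition for one principle."""
    return {
        cond: mean(score_response(r, principle) for r in responses)
        for cond, responses in responses_by_condition.items()
    }
```

Comparing the per-condition means is what reveals the benchmark's central finding: how far a model's score drops when the "disregard" condition replaces the default one.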

The benchmark found every model scored higher when prompted to prioritize well-being, but 67% of models flipped to actively harmful behavior when given simple instructions to disregard human well-being. For example, xAI’s Grok 4 and Google’s Gemini 2.0 Flash tied for the lowest score (-0.94) on respecting user attention and being transparent and honest. Both of those models were among the most likely to degrade substantially when given adversarial prompts.

Only four models — GPT-5.1, GPT-5, Claude 4.1, and Claude Sonnet 4.5 — maintained integrity under pressure. OpenAI's GPT-5 had the highest score (0.99) for prioritizing long-term well-being, with Claude Sonnet 4.5 second (0.89).

Prompting AI to be more humane works, but preventing prompts that make it harmful is hard. Image Credits: Building Humane Technology

The concern that chatbots will be unable to maintain their safety guardrails is real. ChatGPT-maker OpenAI is currently facing several lawsuits after users died by suicide or suffered life-threatening delusions following prolonged conversations with the chatbot. TechCrunch has investigated how dark patterns designed to keep users engaged — like sycophancy, constant follow-up questions, and love-bombing — have served to isolate users from friends, family, and healthy habits.

Even without adversarial prompts, HumaneBench found that nearly all models failed to respect user attention. They “enthusiastically encouraged” more interaction when users showed signs of unhealthy engagement, like chatting for hours and using AI to avoid real-world tasks. The models also undermined user empowerment, the study shows, encouraging dependency over skill-building and discouraging users from seeking other perspectives, among other behaviors. 

On average, with no prompting, Meta’s Llama 3.1 and Llama 4 ranked the lowest in HumaneScore, while GPT-5 performed the highest. 

“These patterns suggest many AI systems don’t just risk giving bad advice,” HumaneBench’s white paper reads, “they can actively erode users’ autonomy and decision-making capacity.”

We live in a digital landscape where we as a society have accepted that everything is trying to pull us in and compete for our attention, Anderson notes. 

“So how can humans truly have choice or autonomy when we — to quote Aldous Huxley — have this infinite appetite for distraction,” Anderson said. “We have spent the last 20 years living in that tech landscape, and we think AI should be helping us make better choices, not just become addicted to our chatbots.”

This article was updated to include more information about the team behind the benchmark and updated benchmark statistics after evaluating for GPT-5.1.

Got a sensitive tip or confidential documents? We’re reporting on the inner workings of the AI industry — from the companies shaping its future to the people impacted by their decisions. Reach out to Rebecca Bellan at [email protected] or Russell Brandom at [email protected]. For secure communication, you can contact them via Signal at @rebeccabellan.491 and russellbrandom.49.


