Close Menu
RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations
  • Home
  • AI
  • Crypto
  • Cybersecurity
  • IT
  • Energy
  • Robotics
  • TechCrunch
  • Technology
What's Hot

Investors trust Google more than Meta when comes to spending on AI

April 30, 2026

Paragon is not collaborating with Italian authorities probing spyware attacks, report says

April 28, 2026

Microsoft cuts OpenAI revenue share as their AI alliance loosens

April 28, 2026
Facebook X (Twitter) Instagram
Trending
  • Investors trust Google more than Meta when comes to spending on AI
  • Paragon is not collaborating with Italian authorities probing spyware attacks, report says
  • Microsoft cuts OpenAI revenue share as their AI alliance loosens
  • Robotically assembled building blocks could make construction more efficient and sustainable | MIT News
  • AI showdown: Musk and Altman go to trial in fight over OpenAI’s beginnings
  • U.S., Iran seize ships as war evolves into standoff over Strait of Hormuz
  • Google launches training and inference TPUs in latest shot at Nvidia
  • Zoom teams up with World to verify humans in meetings
  • Home
  • About Us
  • Advertise
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
Facebook X (Twitter) Instagram
RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech InnovationsRoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations
Tuesday, May 12
  • Home
  • AI
  • Crypto
  • Cybersecurity
  • IT
  • Energy
  • Robotics
  • TechCrunch
  • Technology
RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations
Home » OpenAI’s research on AI models deliberately lying is wild 

OpenAI’s research on AI models deliberately lying is wild 

GTBy GTSeptember 19, 2025 TechCrunch No Comments4 Mins Read
Share
Facebook Twitter LinkedIn Pinterest Email


Every now and then, researchers at the biggest tech companies drop a bombshell. There was the time Google said its latest quantum chip indicated multiple universes exist. Or when Anthropic gave its AI agent Claudius a snack vending machine to run and it went amok, calling security on people and insisting it was human.  

This week, it was OpenAI’s turn to raise our collective eyebrows.

OpenAI released on Monday some research that explained how it’s stopping AI models from “scheming.” It’s a practice in which an “AI behaves one way on the surface while hiding its true goals,” OpenAI defined in its tweet about the research.   

In the paper, conducted with Apollo Research, researchers went a bit further, likening AI scheming to a human stock broker breaking the law to make as much money as possible. The researchers, however, argued that most AI “scheming” wasn’t that harmful. “The most common failures involve simple forms of deception — for instance, pretending to have completed a task without actually doing so,” they wrote. 

The paper was mostly published to show that “deliberative alignment⁠” — the anti-scheming technique they were testing — worked well. 

But it also explained that AI developers haven’t figured out a way to train their models not to scheme. That’s because such training could actually teach the model how to scheme even better to avoid being detected. 

“A major failure mode of attempting to ‘train out’ scheming is simply teaching the model to scheme more carefully and covertly,” the researchers wrote. 

Techcrunch event

San Francisco
|
October 27-29, 2025

Perhaps the most astonishing part is that, if a model understands that it’s being tested, it can pretend it’s not scheming just to pass the test, even if it is still scheming. “Models often become more aware that they are being evaluated. This situational awareness can itself reduce scheming, independent of genuine alignment,” the researchers wrote. 

It’s not news that AI models will lie. By now most of us have experienced AI hallucinations, or the model confidently giving an answer to a prompt that simply isn’t true. But hallucinations are basically presenting guesswork with confidence, as OpenAI research released earlier this month documented. 

Scheming is something else. It’s deliberate.  

Even this revelation — that a model will deliberately mislead humans — isn’t new. Apollo Research first published a paper in December documenting how five models schemed when they were given instructions to achieve a goal “at all costs.”  

The news here is actually good news: The researchers saw significant reductions in scheming by using “deliberative alignment⁠.” That technique involves teaching the model an “anti-scheming specification” and then making the model go review it before acting. It’s a bit like making little kids repeat the rules before allowing them to play. 

OpenAI researchers insist that the lying they’ve caught with their own models, or even with ChatGPT, isn’t that serious. As OpenAI’s co-founder Wojciech Zaremba told TechCrunch’s Maxwell Zeff about this research: “This work has been done in the simulated environments, and we think it represents future use cases. However, today, we haven’t seen this kind of consequential scheming in our production traffic. Nonetheless, it is well known that there are forms of deception in ChatGPT. You might ask it to implement some website, and it might tell you, ‘Yes, I did a great job.’ And that’s just the lie. There are some petty forms of deception that we still need to address.”

The fact that AI models from multiple players intentionally deceive humans is, perhaps, understandable. They were built by humans, to mimic humans, and (synthetic data aside) for the most part trained on data produced by humans. 

It’s also bonkers. 

While we’ve all experienced the frustration of poorly performing technology (thinking of you, home printers of yesteryear), when was the last time your not-AI software deliberately lied to you? Has your inbox ever fabricated emails on its own? Has your CMS logged new prospects that didn’t exist to pad its numbers? Has your fintech app made up its own bank transactions? 

It’s worth pondering this as the corporate world barrels toward an AI future where companies believe agents can be treated like independent employees. The researchers of this paper have the same warning.

“As AIs are assigned more complex tasks with real-world consequences and begin pursuing more ambiguous, long-term goals, we expect that the potential for harmful scheming will grow — so our safeguards and our ability to rigorously test must grow correspondingly,” they wrote. 



Source link

GT
  • Website

Keep Reading

Paragon is not collaborating with Italian authorities probing spyware attacks, report says

Zoom teams up with World to verify humans in meetings

Hackers are abusing unpatched Windows security flaws to hack into organizations

‘Tokenmaxxing’ is making developers less productive than they think

Sources: Cursor in talks to raise $2B+ at $50B valuation as enterprise growth surges

Kevin Weil and Bill Peebles exit OpenAI as company continues to shed ‘side quests’

Add A Comment
Leave A Reply Cancel Reply

Editors Picks

Investors trust Google more than Meta when comes to spending on AI

April 30, 2026

Google launches training and inference TPUs in latest shot at Nvidia

April 27, 2026

Meta tracks employee usage on Google, LinkedIn AI training project

April 25, 2026

Meta will cut 10% of workforce as company pushes deeper into AI

April 24, 2026
Latest Posts

Malicious Chrome Extension Steal ChatGPT and DeepSeek Conversations from 900K Users

April 1, 2026

Top 10 Best Server Monitoring Tools

April 1, 2026

10 Best Cybersecurity Risk Management Tools

March 31, 2026

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Welcome to RoboNewsWire, your trusted source for cutting-edge news and insights in the world of technology. We are dedicated to providing timely and accurate information on the most important trends shaping the future across multiple sectors. Our mission is to keep you informed and ahead of the curve with deep dives, expert analysis, and the latest updates in key industries that are transforming the world.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Facebook X (Twitter) Instagram
  • Home
  • About Us
  • Advertise
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2026 Robonewswire. Designed by robonewswire.

Type above and press Enter to search. Press Esc to cancel.