RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations

Anthropic’s new AI model turns to blackmail when engineers try to take it offline

By GT | May 23, 2025 | TechCrunch

Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when it learns it will be replaced with a new AI system and is given sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.

During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.

In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

Anthropic says Claude Opus 4 is state-of-the-art in several regards, and competitive with some of the best AI models from OpenAI, Google, and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led the company to beef up its safeguards. Anthropic says it’s activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.




