News

An artificial intelligence model has the ability to blackmail developers — and isn’t afraid to use it, according to reporting ...
Anthropic's most powerful model yet, Claude 4, has unwanted side effects: The AI can report you to authorities and the press.
In a fictional scenario, the model was willing to expose that the engineer seeking to replace it was having an affair.
Anthropic's Claude AI tried to blackmail engineers during safety tests, threatening to expose personal info if shut down ...
The testing found the AI was capable of "extreme actions" if it thought its "self-preservation" was threatened.
This development, detailed in a recently published safety report, has led Anthropic to classify Claude Opus 4 as an 'ASL-3' ...
A third-party research institute Anthropic partnered with to test Claude Opus 4 recommended against deploying an early ...
An artificial intelligence model reportedly attempted to threaten and blackmail its own creator during internal testing ...
These safeguards are supposed to prevent the bots from sharing illegal, unethical, or downright dangerous information. But ...