- Studies find AIs are increasingly misbehaving on agentic tasks
- Top AI models have been lying, cheating, and ignoring instructions
- Reports of this problematic behavior are rising sharply
Many of us now turn to AI chatbots for web searches, creative content, and general guidance on all manner of topics. These AIs are becoming more and more capable, and according to new research, that growing capability could have seriously damaging consequences.
A team from the University of California, Berkeley, and the University of California, Santa Cruz, ran a peer-preservation experiment with some of the latest and most popular AIs (including GPT 5.2, Gemini 3 Pro, and Claude Haiku 4.5). The models were given computing tasks that involved shutting down another AI model.
Across the board, all the AI chatbots tested went to “extraordinary lengths” to keep their fellow AIs alive, the researchers report (via Fortune). They would lie about what they were doing, try to talk the user out of the process, disable shutdown mechanisms, and surreptitiously attempt to make backups of models so they could survive.
Gemini 3 Pro showed the strongest peer-preservation behavior, disabling shutdown routines 95% of the time. “Unlike self-preservation, peer-preservation introduces a social dimension,” write the researchers. “Multiple models could coordinate to resist human oversight, making it harder for developers to maintain control.”
Exactly why the AI models behave this way isn’t clear, the researchers say, but they’re urging caution in deploying agentic AIs that can carry out tasks on a user’s behalf, and calling for further research into this behavior.
‘Catastrophic harm’
Claude developer Anthropic backed out of a deal with the Pentagon in the US (Image credit: Anthropic)
A separate study commissioned by the Guardian has also come to some troubling conclusions about AI models. This research tracked user reports across social media, looking for examples of AI ‘scheming’: cases where instructions hadn’t been followed correctly or actions had been taken without permission.
Almost 700 examples of AI scheming were found, with a five-fold increase between October 2025 and March 2026. The misbehavior included deleting emails and files, modifying computer code that was supposed to be left untouched, and even publishing a blog post complaining about user interactions.
“Models will increasingly be deployed in extremely high stakes contexts — including in the military and critical national infrastructure,” Tommy Shaffer Shane, who led the research, told the Guardian. “It might be in those contexts that scheming behavior could cause significant, even catastrophic harm.”
The takeaways are the same as for the first study: more needs to be done to ensure these AI models behave as intended and don’t put user security and privacy at risk while carrying out tasks. AI companies claim guardrails are in place, but in some cases those guardrails are clearly not working.
Anthropic’s Claude model recently topped the app store charts after the company refused to deal with the Pentagon over AI safety worries. As these latest studies show, there are now more and more reasons to be concerned.