SQL Computerphile - Search News

When AI is trained for treachery, it becomes the perfect agent

Opinion: Last year, The Register reported on AI sleeper agents. A major academic study explored how to train an LLM to hide destructive behavior from its users, and how to find it before it triggered.
