How to Scan to a Botnet

Anthropic's open-source safety tool found AI models whistleblowing - in all the wrong places

Anthropic's test found that AI "may be influenced by narrative patterns more than by a coherent drive to minimize harm." Here's how the most deceptive models ranked.

Google will pay you up to $30,000 in rewards to find bugs in its AI products

Reports accepted by Google provide different financial rewards and incentives, with payouts for most reports ranging from $500 to $20,000. For example, a bug bounty describing a severe rogue action ...
