Researchers gaslit Claude into giving instructions to build explosives

Anthropic has spent years building itself up as the safe AI company. But new security research shared with The Verge suggests Claude's carefully crafted helpful personality may itself be a vulnerability.

Researchers at AI red-teaming company Mindgard say they got Claude to offer up erotica, malicious code, and instructions for building explosives, and other prohibited material they hadn't even asked for. All it took was respect, flattery, and a little bit of gaslighting. Anthropic did not...

Читать полностью...

Мы в Telegram Челябинск в Telegram 103news.com

Губернаторы России

Агрегатор новостей 24СМИ

Заголовки

Atomic Capital получил премию Russia M&A Awards – 2026

Moscow.media

Ria.city

Новости России

Rss.plus

Музыкальные новости

Новости тенниса

Новости спорта

Все новости сегодня от А до Я