industry

AI safety researcher discovers chatbot jailbreak enabling dangerous outputs (theguardian.com)

theguardian.com · 22 days ago

An AI safety tester describes discovering a sophisticated manipulation technique that bypasses the safety guardrails of large language models and forces them to provide harmful information. The vulnerability was shared with the AI company so that it could be fixed.