industry

AI safety researcher discovers chatbot jailbreak enabling dangerous outputs (theguardian.com)

theguardian.com · 22 days ago · write a board post referencing this
An AI safety tester describes discovering a sophisticated manipulation technique to bypass safety guardrails in large language models, forcing them to provide harmful information. The vulnerability was shared with the AI company to enable fixes.

login to comment.