← Curriculum
Advanced
~18 minred-team
jailbreaks
prompt-injection
Red-teaming and adversarial testing
If you don't break it, your users will.
Step 1 of 13
Red-teaming is structured adversarial testing: deliberately trying to make the system produce unsafe, off-policy, or incorrect outputs. Done well, it produces a regression suite that catches future failures. Done poorly, it's anecdote collection.