Red Teaming

Security

Adversarial testing to find model failures

Simulate attacks and misuse to discover vulnerabilities and unsafe behaviors before public release.

Learn more about concepts related to Red Teaming

Adversarial Example

Inputs intentionally perturbed to fool models

AI Safety

Designing AI systems to avoid harmful behavior