Adversarial Robustness and Privacy
Even advanced AI systems can be vulnerable to adversarial attacks. We're building tools to protect AI and certify its robustness, including methods for quantifying the vulnerability of neural networks and designing new attacks to inform better defenses. We're also helping AI systems adhere to privacy requirements.
Our work
- What is red teaming for generative AI? (Explainer, Kim Martineau)
- An open-source toolkit for debugging AI models of all data types (Technical note, Kevin Eykholt and Taesung Lee)
- Did an AI write that? If so, which one? Introducing the new field of AI forensics (Explainer, Kim Martineau)
- Manipulating stock prices with an adversarial tweet (Research, Kim Martineau)
- Securing AI systems with adversarial robustness (Deep Dive, Pin-Yu Chen)
- Researchers develop defenses against deep learning hack attacks (Release, Ambrish Rawat, Killian Levacher, and Mathieu Sinn)
Publications
VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis
- Chia-yi Hsu
- Jia You Chen
- et al.
- 2025
- ICASSP 2025
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models
- Xiaomeng Xu
- Pin-Yu Chen
- et al.
- 2025
- AAAI 2025
Retention Score: Quantifying Jailbreak Risks for Vision Language Models
- Zhaitang Li
- Pin-Yu Chen
- et al.
- 2025
- AAAI 2025
Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation
- Maya Anderson
- Guy Amit
- et al.
- 2025
- ICISSP 2025
The Inherent Adversarial Robustness of Analog In-Memory Computing
- 2025
- Nature Communications
Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training
- Kristjan Greenewald
- Yuancheng Yu
- et al.
- 2024
- NeurIPS 2024