VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis. Chia-yi Hsu, Jia You Chen, et al. ICASSP 2025.
Retention Score: Quantifying Jailbreak Risks for Vision Language Models. Zhaitang Li, Pin-Yu Chen, et al. AAAI 2025.
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models. Xiaomeng Xu, Pin-Yu Chen, et al. AAAI 2025.
Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation. Maya Anderson, Guy Amit, et al. ICISSP 2025.
The Inherent Adversarial Robustness of Analog In-Memory Computing. Corey Liam Lammie, Julian Büchel, et al. Nature Communications, 2025.
Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training. Kristjan Greenewald, Yuancheng Yu, et al. NeurIPS 2024.
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI. Ambrish Rawat, Stefan Schoepf, et al. NeurIPS 2024.
Membership Inference Attacks Against Time-Series Models. Noam Koren, Abigail Goldsteen, et al. ACML 2024.
MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks. Giandomenico Cornacchia, Kieran Fraser, et al. AIES 2024.
On Robustness-Accuracy Characterization of Language Models using Synthetic Datasets. Ching-yun Ko, Pin-Yu Chen, et al. COLM 2024.