On Robustness-Accuracy Characterization of Language Models using Synthetic Datasets. Ching-yun Ko, Pin-Yu Chen, et al. COLM 2024.
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised Learning. Zhiyuan He, Yijun Yang, et al. ICML 2024.
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts. Zhi-yi Chin, Chieh-ming Jiang, et al. ICML 2024.
What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian Benchmarks. Irene Ko, Pin-Yu Chen, et al. ICML 2024.
Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation. Tomas Bueno Momcilovic, Beat Buesser, et al. xAI 2024.
Improving Membership Inference Attacks against Classification Models. Shlomit Shachor, Natalia Razinkov, et al. KES-IDT 2024.
Overload: Latency Attacks on Object Detection for Edge Devices. Erh-Chung Chen, Pin-Yu Chen, et al. CVPR 2024.
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing. Jiabao Ji, Bairu Hou, et al. NAACL 2024.
Evaluating the Impact of Skin Tone Representation on Out-of-Distribution Detection Performance in Dermatology. Assala Benmalek, Celia Cintas, et al. ISBI 2024.
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi, Yi Zeng, et al. ICLR 2024.