What Would Gauss Say About Representations? Probing Pretrained Image Models using Synthetic Gaussian BenchmarksIrene KoPin-Yu Chenet al.2024ICML 2024
Trust Regions for Explanations via Black-Box Probabilistic CertificationAmit DhurandharSwagatam Haldaret al.2024ICML 2024
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic PromptsZhi-yi ChinChieh-ming Jianget al.2024ICML 2024
Thermometer: Towards Universal Calibration for Large Language ModelsMaohao ShenSubhro Daset al.2024ICML 2024
Be Your Own Neighborhood: Detecting Adversarial Examples by the Neighborhood Relations Built on Self-Supervised LearningZhiyuan HeYijun Yanget al.2024ICML 2024
Slicing Mutual Information Generalization Bounds for Neural NetworksKimia NadjahiKristjan Greenewaldet al.2024ICML 2024
Larimar: Large Language Models with Episodic Memory ControlPayel DasSUBHAJIT CHAUDHURYet al.2024ICML 2024
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMsSwanand Ravindra KadheFarhan Ahmedet al.2024ICML 2024
CharED: Character-wise Ensemble Decoding for Large Language ModelsKevin GuEva Tueckeet al.2024ICML 2024
Towards Assurance of LLM Adversarial Robustness using Ontology-Driven ArgumentationTomas Bueno MomcilovicBeat Buesseret al.2024xAI 2024