SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging. Aladin Djuhera, Swanand Ravindra Kadhe, et al. 2025. ICLR 2025.
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs. Swanand Ravindra Kadhe, Farhan Ahmed, et al. 2024. ICML 2024.
Benchmarking the Effect of Poisoning Defenses on the Security and Bias of Deep Learning Models. Nathalie Baracaldo Angel, Farhan Ahmed, et al. 2023. S&P 2023.
Benchmarking the Effect of Poisoning Defenses on the Security and Bias of the Final Model. Nathalie Baracaldo Angel, Kevin Eykholt, et al. 2022. NeurIPS 2022.
On the Feasibility of Compressing Certifiably Robust Neural Networks. Pratik Vaishnavi, Veena Krish, et al. 2022. NeurIPS 2022.