Cookie Consent Has Disparate Impact on Estimation Accuracy. Erik Miehling, Rahul Nair, et al. 2023. NeurIPS 2023.
On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach. Dennis Wei, Rahul Nair, et al. 2022. NeurIPS 2022.
User Driven Model Adjustment via Boolean Rule Explanations. Elizabeth Daly, Massimilliano Mattetti, et al. 2021. AAAI 2021.
AIMEE: Interactive model maintenance with rule-based surrogates. Owen Cornec, Rahul Nair, et al. 2021. NeurIPS 2021.
Explaining knock-on effects of bias mitigation. Svetoslav Nizhnichenkov, Rahul Nair, et al. 2023. NeurIPS 2023.
Contrastive Explanations for Comparing Preferences of Reinforcement Learning. Jasmina Gajcin, Rahul Nair, et al. 2022. AAAI 2022.
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI. Ambrish Rawat, Stefan Schoepf, et al. 2024. NeurIPS 2024.