FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs. Swanand Ravindra Kadhe, Anisa Halimi, et al. (2023). NeurIPS 2023.
Subtle Misogyny Detection and Mitigation: An Expert-Annotated Dataset. Anna Richter, Brooklyn Sheppard, et al. (2023). NeurIPS 2023.
Probabilistic Abduction for Visual Abstract Reasoning via Learning Rules in Vector-symbolic Architectures. Michael Hersche, Francesco Di Stefano, et al. (2023). NeurIPS 2023.
Weakly Supervised Detection of Hallucinations in LLM Activations. Miriam Rateike, Celia Cintas, et al. (2023). NeurIPS 2023.
Cost-Aware Counterfactuals for Black Box Explanations. Natalia Martinez Gil, Kanthi Sarpatwar, et al. (2023). NeurIPS 2023.
Risk Assessment and Statistical Significance in the Age of Foundation Models. Apoorva Nitsure, Youssef Mroueh, et al. (2023). NeurIPS 2023.
Characterizing pre-trained and task-adapted molecular representations. Celia Cintas, Payel Das, et al. (2023). NeurIPS 2023.
PROMINET: Prototype-based Multi-View Network for Interpretable Email Response Prediction. Yuqing Wang, Prashanth Vijayaraghavan, et al. (2023). EMNLP 2023.