Position: Theory of Mind Benchmarks are Broken for Large Language Models. Matthew Riemer, Zahra Ashktorab, et al. 2025. ICML 2025.
Exploring Straightforward Methods for Automatic Conversational Red-Teaming. George Kour, Naama Zwerdling, et al. 2025. NAACL 2025.
The Literary Canons of Large-Language Models: An Exploration of the Frequency of Novel and Author Generations Across Gender, Race and Ethnicity, and Nationality. Paulina Toro Isaza, Nalani Kopp. 2025. NAACL 2025.
A transfer learning framework for weak to strong generalization. Seamus Somerstep, Felipe Maia Polo, et al. 2025. ICLR 2025.
Out-of-Distribution Detection using Synthetic Data Generation. Momin Abbas, Muneeza Azmat, et al. 2025. ICLR 2025.
Combinatorial Test Design Model Creation using Large Language Models. Debbie Furman, Eitan Farchi, et al. 2025. IWCT 2025.
Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring. Buse Korkmaz, Rahul Nair, et al. 2025. AAAI 2025.