Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models. George Kour, Itay Nakash, et al. 2025. ACL 2025.
Exploring Straightforward Methods for Automatic Conversational Red-Teaming. George Kour, Naama Zwerdling, et al. 2025. NAACL 2025.
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In. Itay Nakash, George Kour, et al. 2025. NAACL 2025.
A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios. Samuel Ackerman, Ella Rabinovich, et al. 2024. EMNLP 2024.
Unveiling Safety Vulnerabilities of Large Language Models. George Kour, Marcel Zalmanovici, et al. 2023. EMNLP 2023.
Predicting Question-Answering Performance of Large Language Models through Semantic Consistency. Ella Rabinovich, Samuel Ackerman, et al. 2023. EMNLP 2023.
Text Augmentation Using Dataset Reconstruction for Low-Resource Classification. Adir Rahamim, Guy Uziel, et al. 2023. ACL 2023.
Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora. George Kour, Samuel Ackerman, et al. 2022. EMNLP 2022.