DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation. Eliya Habba, Ofir Arviv, et al. 2025. ACL 2025.
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community. Shachar Don-Yehiya, Leshem Choshen, et al. 2025. ACL 2025.
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead. Rickard Gabrielsson, Jiacheng Zhu, et al. 2025. ICML 2025.
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content. Nimrod Shabtay, Felipe Maia Polo, et al. 2025. ICLR 2025.
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning. Eliyahu Schwartz, Leshem Choshen, et al. 2024. EMNLP 2024.
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion. Kerem Zaman, Leshem Choshen, et al. 2024. EMNLP 2024.
Kristjan Greenewald, Senior Research Scientist and Manager, Statistical Methods for Large Language Models