Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation. Yotam Perlitz, Ariel Gera, et al. NeurIPS 2025.
Elements of World Knowledge (EWoK): A Cognition-Inspired Framework for Evaluating Basic World Knowledge in Language Models. Anna A. Ivanova, Aalok Sathe, et al. Transactions of the Association for Computational Linguistics, 2025.
DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation. Eliya Habba, Ofir Arviv, et al. ACL 2025.
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community. Shachar Don-Yehiya, Leshem Choshen, et al. ACL 2025.
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead. Rickard Gabrielsson, Jiacheng Zhu, et al. ICML 2025.
The Future of Open Human Feedback. Shachar Don-Yehiya, Ben Burtenshaw, et al. Nature Machine Intelligence, 2025.
LiveXiv: A Multi-Modal Live Benchmark Based on Arxiv Papers Content. Nimrod Shabtay, Felipe Maia Polo, et al. ICLR 2025.