Publications

Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation