Never Train from Scratch: Fair Comparison of Long-Sequence Models Requires Data-Driven Priors (Extended Abstract). Ido Amos, Jonathan Berant, et al. IJCAI 2025.
SCROLLS: Standardized CompaRison Over Long Language Sequences. Uri Shaham, Elad Segal, et al. EMNLP 2022.
Neural network gradient-based learning of black-box function interfaces. Alon Jacovi, Guy Hadash, et al. ICLR 2019.