LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content. Nimrod Shabtay, Felipe Maia Polo, et al. 2025. ICLR 2025.
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs. Irene Huang, Wei Lin, et al. 2024. NeurIPS 2024.
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning. Eliyahu Schwartz, Leshem Choshen, et al. 2024. EMNLP 2024.
MAEDAY: MAE for few- and zero-shot AnomalY-Detection. Eli Schwartz, Assaf Arbelle, et al. 2024. Computer Vision and Image Understanding.
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data. Roei Herzig, Ofir Abramovich, et al. 2024. WACV 2024.
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models. Sivan Doveh, Assaf Arbelle, et al. 2023. NeurIPS 2023.
Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs. Roei Herzig, Alon Mendelson, et al. 2023. EMNLP 2023.
Teaching Structured Vision & Language Concepts to Vision & Language Models. Sivan Doveh, Assaf Arbelle, et al. 2023. CVPR 2023.
ConStruct-VL: Data-Free Continual Structured VL Concepts Learning. James Smith, Paola Cascante-Bonilla, et al. 2023. CVPR 2023.
CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning. James Seale Smith, Leonid Karlinsky, et al. 2023. CVPR 2023.