Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval. Kuniaki Saito, Kihyuk Sohn, et al. CVPR 2023.
MaskSketch: Unpaired Structure-guided Masked Image Generation. Dina Bashkirova, José Lezama, et al. CVPR 2023.
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners. Zitian Chen, Yikang Shen, et al. CVPR 2023.
Learning Situation Hyper-Graphs for Video Question Answering. Aisha Urooj Khan, Hilde Kuehne, et al. CVPR 2023.
ScaleFL: Resource-Adaptive Federated Learning with Heterogeneous Clients. Fatih Ilhan, Gong Su, et al. CVPR 2023.
Teaching Structured Vision & Language Concepts to Vision & Language Models. Sivan Doveh, Assaf Arbelle, et al. CVPR 2023.
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos. Kun Su, Kaizhi Qian, et al. CVPR 2023.