Overload: Latency Attacks on Object Detection for Edge Devices. Erh-Chung Chen, Pin-Yu Chen, et al. CVPR 2024.
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World. Yining Hong, Zishuo Zheng, et al. CVPR 2024.
Resource-Efficient Transformer Pruning for Finetuning of Large Models. Fatih Ilhan, Gong Su, et al. CVPR 2024.
Grounding Everything: Emerging Localization Properties in Vision-Language Transformers. Walid Bousselham, Felix Petersen, et al. CVPR 2024.