Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance. Phuc Nguyen, Tuan Duc Ngo, et al. CVPR 2024.
What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. Brian Chen, Nina Shvetsova, et al. CVPR 2024.
Overload: Latency Attacks on Object Detection for Edge Devices. Erh-Chung Chen, Pin-Yu Chen, et al. CVPR 2024.
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World. Yining Hong, Zishuo Zheng, et al. CVPR 2024.