FOCUS: Clustering crowdsourced videos by line-of-sight
Abstract
Crowdsourced video often provides engaging and diverse perspectives not captured by professional videographers. The broad appeal of user-uploaded video has been widely confirmed: freely distributed on YouTube, by subscription on Vimeo, and to peers on Facebook/Google+. Unfortunately, user-generated multimedia can be difficult to organize; these services depend on manual "tagging" or machine-mineable viewer comments. While manual indexing can be effective for popular, well-established videos, newer content may be poorly searchable; live video need not apply. We envisage video-sharing services for live user video streams, indexed automatically and in real time, especially by shared content. We propose FOCUS, a system for Hadoop-on-cloud video analytics. FOCUS uniquely leverages visual 3D model reconstruction and multimodal sensing to decipher and continuously track a video's line-of-sight. Through spatial reasoning on the relative geometry of multiple video streams, FOCUS recognizes shared content even when viewed from diverse angles and distances. In a 70-volunteer user study, FOCUS' clustering correctness is roughly comparable to that of humans.
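
To make the idea of clustering by line-of-sight concrete, the following is a minimal 2D sketch, not the FOCUS implementation. It assumes each video stream has already been reduced to a camera position and a heading angle, treats two streams as sharing content when their sight rays cross within a hypothetical `max_range`, and groups streams by connected components. The function names, the `max_range` parameter, and the ray-crossing criterion are all illustrative assumptions.

```python
import math
from itertools import combinations


def ray_intersection(p, theta, q, phi):
    """Distances (t, u) along two 2D rays to their crossing point.

    Rays start at p and q with headings theta and phi (radians).
    Returns None when the rays are (nearly) parallel.
    """
    dx1, dy1 = math.cos(theta), math.sin(theta)
    dx2, dy2 = math.cos(phi), math.sin(phi)
    denom = dx1 * dy2 - dy1 * dx2  # 2D cross product of the directions
    if abs(denom) < 1e-9:
        return None
    rx, ry = q[0] - p[0], q[1] - p[1]
    t = (rx * dy2 - ry * dx2) / denom
    u = (rx * dy1 - ry * dx1) / denom
    return t, u


def shares_view(cam_a, cam_b, max_range):
    """Assumed criterion: the sight rays cross in front of both cameras,
    no farther than max_range from either one."""
    (p, theta), (q, phi) = cam_a, cam_b
    hit = ray_intersection(p, theta, q, phi)
    if hit is None:
        return False
    t, u = hit
    return 0 < t <= max_range and 0 < u <= max_range


def cluster_by_line_of_sight(cameras, max_range=50.0):
    """Group camera names into connected components of the shares_view relation."""
    names = list(cameras)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if shares_view(cameras[a], cameras[b], max_range):
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())
```

As a usage example, two cameras at (0, 0) and (10, 0) with headings of 45 and 135 degrees look at a common point and fall into one cluster, while two cameras with parallel headings remain separate. A real system would have to work in 3D and tolerate noisy position and orientation estimates, which this sketch ignores.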