Oznur Alkan, Massimilliano Mattetti, et al.
INFORMS 2020
A serious problem in both audio and video conferencing facilities available today is the difficulty in determining who is speaking among a large number of participants. There is a strong need for developing meeting room infrastructure and teleconference facilities that improve the sense of presence and participation experienced in remote meetings. We present a distributed multimodal tracking system that uses multiple cameras and microphones to automatically select the current speaker among multiple meeting participants. The system actively obtains and transmits video showing a good view of the selected speaker. The tracking system is integrated into a web-based video conferencing application that connects seven meeting rooms around the globe. An important part of designing such a system is to determine sensor placement and configuration through systematic experiments in the actual rooms where the system is deployed.
Casey Dugan, Werner Geyer, et al.
CHI 2010
Rajesh Balchandran, Leonid Rachevsky, et al.
INTERSPEECH 2009
Elaine Hill
Human-Computer Interaction