Detecting discussion scenes in instructional videos
Abstract
This paper addresses the problem of detecting discussion scenes in instructional videos using statistical approaches. Specifically, given a series of speech segments separated from the audio tracks of educational videos, we first model the instructor's voice with a Gaussian mixture model (GMM); a four-state transition machine then extracts discussion scenes in real time from detected instructor-student speaker change points. Meanwhile, the GMM is continuously updated to accommodate variation in the instructor's voice over time. Promising experimental results have been achieved on five educational videos from the IBM MicroMBA program, and interesting instruction and teaching patterns have been observed. The extracted scene information can facilitate semantic indexing and structuring of instructional video content.
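For illustration only, the following is a minimal sketch of the detection loop described above, assuming scikit-learn's GaussianMixture over precomputed per-segment acoustic features (e.g., MFCC frames). The abstract does not specify the four states, the decision threshold, or the transition rules, so the ones below (a simple debounce-style machine) are hypothetical; the online GMM adaptation and any retraction of tentative labels are likewise omitted.

```python
# Hypothetical sketch of GMM-based instructor modeling plus a four-state
# machine for discussion-scene detection. State names, threshold, and
# transition logic are assumptions, not the paper's actual design.
import numpy as np
from sklearn.mixture import GaussianMixture

LECTURE, MAYBE_DISCUSSION, DISCUSSION, MAYBE_LECTURE = range(4)

def fit_instructor_model(instructor_frames, n_components=8):
    """Fit a GMM to feature frames known to come from the instructor."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=0)
    gmm.fit(instructor_frames)
    return gmm

def is_instructor(gmm, segment_frames, threshold=-30.0):
    # Average per-frame log-likelihood under the instructor GMM; segments
    # scoring below the (illustrative) threshold are treated as student speech.
    return gmm.score(segment_frames) >= threshold

def label_segments(gmm, segments):
    """Run a debounce-style four-state machine over segments, in order.

    One off-instructor segment is tentative (MAYBE_DISCUSSION); a second
    confirms DISCUSSION. Inside a discussion, one instructor segment is a
    tentative ending (MAYBE_LECTURE); a second confirms a return to LECTURE.
    """
    state, labels = LECTURE, []
    for frames in segments:
        inst = is_instructor(gmm, frames)
        if state == LECTURE:
            state = LECTURE if inst else MAYBE_DISCUSSION
        elif state == MAYBE_DISCUSSION:
            state = LECTURE if inst else DISCUSSION
        elif state == DISCUSSION:
            state = MAYBE_LECTURE if inst else DISCUSSION
        else:  # MAYBE_LECTURE
            state = LECTURE if inst else DISCUSSION
        labels.append("discussion" if state != LECTURE else "lecture")
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in 13-dim "MFCC" frames: instructor near mean 0, students near 3.
    instructor = rng.normal(0.0, 1.0, size=(500, 13))
    segments = [rng.normal(m, 1.0, size=(50, 13)) for m in (0, 0, 3, 3, 0, 0)]
    gmm = fit_instructor_model(instructor)
    print(label_segments(gmm, segments))
    # -> ['lecture', 'lecture', 'discussion', 'discussion', 'discussion', 'lecture']
```

The two "maybe" states act as a one-segment hysteresis, so a single misclassified segment does not immediately open or close a discussion scene; a production system would also calibrate the likelihood threshold on held-out data and periodically refit the GMM on segments confidently attributed to the instructor.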