Learning personalized video highlights from detailed MPEG-7 metadata
Abstract
We present a new framework for generating personalized video digests from detailed event metadata. In this approach, high-level semantic features (e.g., the number of offensive events) are extracted from an existing metadata stream using time windows (e.g., features computed over 16-second intervals). Personalized video digests are generated by a supervised learning algorithm that takes as input examples of important and unimportant events. The window-based features extracted from the metadata are used to train a classifier that, given the metadata for a new video, labels segments as important or unimportant for a specific user, yielding a personalized video digest. Our experimental results on soccer video suggest that high-level semantic information extracted from existing metadata can be used effectively (80% precision and 85% recall under cross-validation) to generate personalized video digests.
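As a minimal sketch of the pipeline the abstract describes (not the paper's actual implementation), the code below computes per-window event counts from a metadata stream and classifies each window as important or unimportant. The event types, 16-second window length, and the nearest-centroid learner are illustrative assumptions standing in for the unspecified supervised algorithm:

```python
from collections import Counter

def window_features(events, window=16, duration=64, types=("offense", "defense")):
    """Count events of each type inside fixed-length time windows.

    events: list of (timestamp_sec, event_type) pairs, a stand-in for
    parsed MPEG-7-style event metadata (hypothetical format).
    Returns one feature vector (counts per event type) per window.
    """
    n_windows = duration // window
    feats = []
    for i in range(n_windows):
        lo, hi = i * window, (i + 1) * window
        counts = Counter(t for ts, t in events if lo <= ts < hi)
        feats.append([counts[t] for t in types])
    return feats

def train_centroids(X, y):
    """Compute one mean feature vector per label (important/unimportant).

    A nearest-centroid classifier is used here only as a simple
    placeholder for the supervised learner described in the abstract.
    """
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def classify(centroids, x):
    """Assign a window to the label whose centroid is nearest (squared distance)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])))
```

Windows labeled important by the classifier would then be concatenated to form the digest; per-user training examples make the result personalized.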