A language independent approach to audio search
Vikram Gupta, Jitendra Ajmera, et al.
INTERSPEECH 2011
Discriminative training of the feature space with the maximum mutual information objective function (fMMI) has been shown to yield remarkable accuracy improvements. In noisy environments, fMMI can be regarded as an effective noise compensation algorithm and can play a significant role in noise robustness. Feature-space speaker adaptation techniques such as feature-space maximum likelihood linear regression (fMLLR) are also well known and are suited to mismatched test data. These feature-space transform algorithms are essential for modern speech recognition but still need further improvement under low-SNR conditions. Separately, long-term spectro-temporal information has received attention as a complement to traditional short-term features. We previously proposed long-term temporal features to improve ASR accuracy for low-SNR speech. In this paper, we show that long-term temporal features can be combined with fMMI to build more discriminative models for noisy speech, and that the proposed method performs favorably under low-SNR conditions. Copyright © 2011 ISCA.
Christoph Tillmann, Sanjika Hewavitharana
INTERSPEECH 2011
Michelle Brachman, Zahra Ashktorab, et al.
PACM HCI
Gang Wang, Fei Wang, et al.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics