Shay Maymon, Etienne Marcheret, et al.
INTERSPEECH 2013
We present the IBM speech activity detection system that was fielded in the phase 2 evaluation of the DARPA RATS (robust automatic transcription of speech) program. Key ingredients of the system are: multi-pass HMM Viterbi segmentation, fusion of multiple feature streams, file-based and speech-based normalization schemes, the use of regular and convolutional deep neural networks, and model fusion through frame-level score combination of channel-dependent models. These techniques were instrumental in achieving a 1.4% equal error rate on the RATS phase 2 evaluation data. Copyright © 2013 ISCA.
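Two of the ingredients named above, frame-level score combination and HMM Viterbi segmentation, can be illustrated with a toy sketch. This is not the IBM system. It assumes each model emits one speech-versus-non-speech log-likelihood ratio per frame, fuses the models by a weighted average, and smooths the result with a single-pass two-state Viterbi decoder. The function names, the `switch_penalty` parameter, and the example scores are all hypothetical.

```python
def fuse_scores(model_scores, weights=None):
    """Weighted average of per-frame scores from several models.

    model_scores: list of equal-length lists, one per model (assumed to be
    speech-vs-non-speech log-likelihood ratios, one value per frame).
    """
    n_models = len(model_scores)
    if weights is None:
        weights = [1.0 / n_models] * n_models  # assumption: equal weights
    n_frames = len(model_scores[0])
    return [
        sum(w * scores[t] for w, scores in zip(weights, model_scores))
        for t in range(n_frames)
    ]


def viterbi_segment(llr, switch_penalty=2.0):
    """Two-state Viterbi smoothing: returns 1 for speech, 0 for non-speech.

    Staying in a state is free; changing state costs switch_penalty
    (a hypothetical tuning knob), which suppresses very short segments.
    """
    if not llr:
        return []

    def emit(t, state):
        # Speech is rewarded by the score, non-speech by its negation.
        return llr[t] if state == 1 else -llr[t]

    score = [emit(0, 0), emit(0, 1)]
    backpointers = []
    for t in range(1, len(llr)):
        new_score, ptr = [], []
        for state in (0, 1):
            stay = score[state]
            switch = score[1 - state] - switch_penalty
            best_prev = state if stay >= switch else 1 - state
            new_score.append(max(stay, switch) + emit(t, state))
            ptr.append(best_prev)
        score = new_score
        backpointers.append(ptr)

    # Trace back from the better final state.
    state = 1 if score[1] > score[0] else 0
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical per-frame scores from two models.
model_a = [-1.2, -0.8, 1.0, 2.5, -0.6, 1.6, 2.4, -1.0, -1.4]
model_b = [-0.8, -1.2, 2.0, 1.5, 0.2, 2.0, 1.6, -2.0, -0.6]

fused = fuse_scores([model_a, model_b])
frame_decisions = [1 if s > 0 else 0 for s in fused]  # no smoothing
segments = viterbi_segment(fused, switch_penalty=2.0)
```

In this example the fused score dips slightly below zero at the fifth frame, so a frame-wise threshold splits the speech region in two, while the Viterbi pass keeps it as one segment because two extra state changes cost more than the dip is worth.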