Fast Gaussian likelihood computation by maximum probability increase estimation for continuous speech recognition
Abstract
Speech signals are semi-stationary, so speech features in neighboring frames are likely to share similar Gaussian distributions. A fast Gaussian computation algorithm is therefore proposed to speed up the computation of the N-best posterior probabilities over a large set of Gaussian distributions for large vocabulary continuous speech recognition. The maximum probability increase between the current speech frame and a previous reference frame is estimated for every Gaussian distribution, which avoids explicit computation of the posteriors for a large number of Gaussians. The method was applied to the fMPE front-end of IBM's state-of-the-art speech recognizer, resulting in a speed-up in probability computation of 40% in a lossless mode and more than 55% in an approximated implementation. ©2008 IEEE.
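As a rough illustration of the pruning idea, the following sketch shows one way a bound on the probability increase can avoid explicit evaluations without changing the N-best result. It is not the estimator used in the paper: the bound (a Cauchy-Schwarz bound using the per-dimension minimum variance), the diagonal-covariance assumption, the use of log-likelihoods in place of normalized posteriors, and all names (`log_gauss`, `n_best_fast`, `min_var`, `slack`) are assumptions made here for illustration only.

```python
import heapq
import math


def log_const(var):
    """Normalization term of a diagonal-covariance Gaussian log-density."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)


def log_gauss(x, mean, var, const):
    """Exact log-likelihood of x under one diagonal-covariance Gaussian."""
    return const - 0.5 * sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var))


def n_best_fast(x, x_ref, ref_scores, means, variances, consts, min_var, n,
                slack=1e-9):
    """Return (indices of the n best Gaussians for x, number of exact evaluations).

    ref_scores[g] is the exact log-likelihood of the reference frame x_ref
    under Gaussian g; min_var[i] is the smallest variance of dimension i over
    all Gaussians. Both are assumed to be available from earlier computation.
    """
    # Frame-level term shared by all Gaussians: an upper bound on the
    # variance-weighted squared distance between the two frames.
    a = sum((xi - ri) ** 2 / mv for xi, ri, mv in zip(x, x_ref, min_var))

    # Upper bound on each Gaussian's score for x. By Cauchy-Schwarz, the
    # increase over the reference score is at most sqrt(a * maha_ref), where
    # maha_ref is the Mahalanobis term of the reference frame.
    upper = []
    for g, ref in enumerate(ref_scores):
        maha_ref = max(0.0, 2.0 * (consts[g] - ref))
        upper.append(ref + math.sqrt(a * maha_ref))

    # Visit Gaussians from the most to the least promising bound.
    order = sorted(range(len(upper)), key=lambda g: upper[g], reverse=True)

    best = []  # min-heap of (exact score, index), at most n entries
    evaluated = 0
    for g in order:
        # Once n exact scores are known, a Gaussian whose bound is below the
        # current n-th best cannot enter the list; neither can any later one.
        # The small slack guards against floating-point rounding.
        if len(best) == n and upper[g] + slack < best[0][0]:
            break
        score = log_gauss(x, means[g], variances[g], consts[g])
        evaluated += 1
        if len(best) < n:
            heapq.heappush(best, (score, g))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, g))

    return [g for _, g in sorted(best, reverse=True)], evaluated
```

When the current frame is close to the reference frame, the shared term `a` is small, the bounds are tight, and the loop can stop after evaluating only a fraction of the Gaussians. An approximated mode could be obtained by stopping earlier than the bound strictly allows, at the cost of occasionally missing a true N-best Gaussian.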