Michael Picheny, Zoltan Tuske, et al.
INTERSPEECH 2019
MPE (Minimum Phone Error) is a previously introduced technique for discriminative training of HMM parameters. fMPE applies the same objective function to the features, transforming the data with a kernel-like method and training millions of parameters, comparable to the size of the acoustic model. Despite the large number of parameters, fMPE is robust to over-training. The method trains a matrix that projects from posteriors of Gaussians to a normal-size feature space, and then adds the projected features to normal features such as PLP. The matrix is trained from a zero start using a linear method. Sparsity of the posteriors ensures speed at both training and test time. The technique gives improvements similar to MPE (around 10% relative). MPE on top of fMPE results in error rates up to 6.5% relative better than MPE alone, or more if multiple layers of transform are trained. © 2005 IEEE.
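The feature transform described in the abstract can be illustrated with a minimal NumPy sketch. It covers only the forward computation for one frame: sparse Gaussian posteriors are projected by a matrix and added to the base features. The function names, the diagonal-covariance Gaussians, the `top_k` pruning, and the renormalization after pruning are assumptions made for illustration; the abstract does not specify them. Training the matrix with the MPE objective is not shown.

```python
import numpy as np


def gaussian_posteriors(x, means, variances, weights, top_k=2):
    """Sparse posteriors of diagonal-covariance Gaussians for one frame.

    x: (D,) feature vector; means, variances: (G, D); weights: (G,).
    Keeping only the top_k posteriors is an illustrative way to obtain
    sparsity, not necessarily the scheme used in the paper.
    """
    diff = x - means
    log_lik = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=1
    )
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)          # subtract max for numerical stability
    post = np.exp(log_post)
    post /= post.sum()

    keep = np.argsort(post)[-top_k:]      # indices of the largest posteriors
    sparse = np.zeros_like(post)
    sparse[keep] = post[keep]
    return sparse / sparse.sum()          # renormalize (assumption)


def fmpe_transform(x, h, M):
    """Add the projected posteriors to the base features: y = x + M h.

    M has shape (D, G). With M all zeros (the "zero start" in the
    abstract), the output equals the input features.
    """
    return x + M @ h


# Toy dimensions, hypothetical values: D=3 features, G=8 Gaussians.
rng = np.random.default_rng(0)
D, G = 3, 8
means = rng.normal(size=(G, D))
variances = np.ones((G, D))
weights = np.full(G, 1.0 / G)
x = rng.normal(size=D)

h = gaussian_posteriors(x, means, variances, weights, top_k=2)
M = np.zeros((D, G))                      # zero start: transform is the identity
y = fmpe_transform(x, h, M)
```

Because `h` has only a few nonzero entries, the product `M @ h` touches only a few columns of `M`, which is the source of the speed the abstract attributes to sparse posteriors.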
Hagen Soltau, Lidia Mangu, et al.
ASRU 2011
Mohamed Kamal Omar, Lidia Mangu
ICASSP 2007
Hagen Soltau, George Saon, et al.
IEEE Transactions on Audio, Speech and Language Processing