G. Zweig, J. Bilmes, et al.
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
We extend the well-known technique of constrained Maximum Likelihood Linear Regression (MLLR) to compute a projection (instead of a full-rank transformation) on the feature vectors of the adaptation data. We model the projected features with phone-dependent Gaussian distributions and model the complement of the projected space with a single class-independent, speaker-specific Gaussian distribution. We then compute the projection and its complement using maximum likelihood techniques. The resulting ML transformation is shown to be equivalent to a speaker-dependent heteroscedastic discriminant analysis (HDA) projection. This contrasts with traditional approaches, which use a single speaker-independent projection and perform speaker adaptation in the resulting subspace. Experimental results on Switchboard show a 3% relative improvement in word error rate over constrained MLLR applied in the projected subspace only.
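The split model described in the abstract can be illustrated with a minimal sketch of the per-frame log-likelihood. This is not the authors' implementation: it assumes a square, invertible speaker transform `A` whose first `p` rows form the projection and whose remaining rows span the complement, full-covariance Gaussians, and no bias term. All function and parameter names (`gaussian_logpdf`, `frame_loglik`, `phone_mean`, `comp_mean`, and so on) are hypothetical.

```python
import numpy as np


def gaussian_logpdf(y, mean, cov):
    """Log-density of a full-covariance Gaussian evaluated at y."""
    d = y.shape[0]
    diff = y - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def frame_loglik(x, A, p, phone_mean, phone_cov, comp_mean, comp_cov):
    """Log-likelihood of one frame x under the assumed split model.

    A          : (n, n) speaker transform; rows [:p] are the projection,
                 rows [p:] the complement (an assumption of this sketch).
    phone_*    : Gaussian of the frame's phone class in the projected space.
    comp_*     : single class-independent, speaker-specific Gaussian
                 for the complement space.
    """
    # Jacobian term from the change of variables y = A x.
    _, logabsdet = np.linalg.slogdet(A)
    y = A @ x
    return (logabsdet
            + gaussian_logpdf(y[:p], phone_mean, phone_cov)
            + gaussian_logpdf(y[p:], comp_mean, comp_cov))
```

Summing `frame_loglik` over a speaker's adaptation frames gives the quantity that a maximum likelihood estimate of `A` would maximize under these assumptions. The `log|det A|` term is what keeps the objective from favoring a transform that simply shrinks the features.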