Mohamed Kamal Omar, Ganesh N. Ramaswamy
ICASSP 2006
We propose a non-linear model space transformation for speaker or environment adaptation based on weighted kernel ridge regression (KRR). The transformation is given by a generalized least squares linear regression in a kernel-induced feature space that operates on Gaussian mixture model means and takes the adaptation frames as targets. Using the "kernel trick", the solution to the optimization problem is obtained by solving a system of linear equations involving the Gram matrix of the input variables. We show that MLLR is a special case of KRR when a linear kernel is employed. Furthermore, we study an efficient low-rank approximation to the kernel matrix, termed the "rectangle method", in which the regressors are chosen to be a small set of clustered adaptation frames. Experiments conducted on the EARS database (English conversational telephone speech) indicate that KRR with a Gaussian RBF kernel outperforms standard regression class-based MLLR. © 2006 IEEE.
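As a rough illustration of the idea in the abstract, the sketch below fits a weighted kernel ridge regression by solving the linear system (K + λ W⁻¹) α = y, where K is the Gram matrix of the inputs. It is not the authors' implementation: the function names, the regularizer `lam`, the per-sample `weights`, the RBF width `gamma`, and the restriction to scalar targets (one feature dimension at a time) are all assumptions made for this example. With the linear kernel the prediction coincides with ordinary weighted ridge regression, which is the sense in which a linear transform appears as a special case.

```python
import math

def linear_kernel(a, b):
    # Plain dot product; with this kernel KRR reduces to linear regression.
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=0.5):
    # Gaussian RBF kernel; gamma is an illustrative width, not a tuned value.
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_weighted_krr(X, y, weights, kernel, lam=1e-2):
    # Dual coefficients alpha from (K + lam * W^{-1}) alpha = y,
    # where W is the diagonal matrix of (hypothetical) per-sample weights.
    n = len(X)
    A = [[kernel(X[i], X[j]) + (lam / weights[i] if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(A, y)

def predict(X, alpha, kernel, x_new):
    # f(x) = sum_i alpha_i * k(x_i, x)
    return sum(a * kernel(xi, x_new) for a, xi in zip(alpha, X))

# Toy data: inputs stand in for model means, targets for adaptation frames.
X = [[1.0], [2.0], [3.0]]
y = [1.1, 1.9, 3.2]
w = [1.0, 2.0, 0.5]

alpha_lin = fit_weighted_krr(X, y, w, linear_kernel, lam=0.1)
alpha_rbf = fit_weighted_krr(X, y, w, rbf_kernel, lam=0.1)
print(predict(X, alpha_lin, linear_kernel, [4.0]))
print(predict(X, alpha_rbf, rbf_kernel, [2.5]))
```

For vector-valued targets the same system matrix can be reused with one right-hand side per target dimension. Only the kernel function changes between the linear and the non-linear variant.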
Ellen M. Eide, Michael A. Picheny
ICASSP 2006
George Saon, Samuel Thomas, et al.
INTERSPEECH 2013
Michael Picheny, Zoltan Tuske, et al.
INTERSPEECH 2019