Fast approximate i-vector estimation using PCA
Abstract
The i-vector representation has become increasingly popular in speaker and language recognition systems. The projection matrix of the i-vector model is usually estimated with the iterative expectation-maximization (EM) algorithm. This work presents a novel approach to estimating both the projection matrix and the i-vector representation of each utterance. We formulate the estimation of the projection matrix as a principal component analysis (PCA) problem. Using the relation between PCA and a linear Gaussian model trained with the EM algorithm, we show that an approximate solution to i-vector estimation can be obtained as the solution of a PCA problem. We evaluate the approximate i-vector estimation on the language recognition task of the Robust Automatic Transcription of Speech (RATS) project. Compared to the standard EM-based approach, the proposed approach reduces the computational time required to estimate the i-vector projection matrix by 50% relative and the time required to estimate the i-vector representation by 42% relative. In addition, our experiments show improvements of up to 29% relative in language recognition performance, measured by equal error rate, over the standard EM-based i-vector estimation.
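
The PCA formulation summarized above can be illustrated with a minimal sketch. It is not the estimation procedure of this work, only a generic PCA projection. It assumes that each utterance has already been summarized by a fixed-length vector (for example, centered sufficient statistics stacked into a supervector). The function names `fit_pca_projection` and `extract_ivector` and the synthetic data are hypothetical and introduced only for illustration.

```python
import numpy as np


def fit_pca_projection(supervectors, rank):
    """Estimate a PCA projection from per-utterance vectors.

    supervectors: array of shape (n_utterances, dim); assumed input format.
    Returns (mean, T), where the columns of T are the top `rank`
    principal directions (orthonormal).
    """
    X = np.asarray(supervectors, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD: rows of vt are principal directions, ordered by
    # decreasing singular value.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    T = vt[:rank].T
    return mean, T


def extract_ivector(supervector, mean, T):
    """Project one centered utterance vector onto the PCA subspace."""
    return T.T @ (np.asarray(supervector, dtype=float) - mean)


# Synthetic low-rank data standing in for utterance statistics (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
loading = rng.normal(size=(5, 40))
data = latent @ loading + 0.01 * rng.normal(size=(200, 40))

mean, T = fit_pca_projection(data, rank=5)
w = extract_ivector(data[0], mean, T)
reconstruction = mean + T @ w
```

In this sketch the projection is obtained from a single non-iterative decomposition, and extracting a low-dimensional vector for an utterance is a single matrix-vector product. This is the kind of saving a PCA formulation can offer over iterating EM updates, although the actual statistics and normalization used in this work are not specified here.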