SVM based speaker recognition: Harnessing trials with multiple enrollment sessions
Abstract
In this paper we extend a variation of the trial-based SVM speaker verification work proposed by Cumani et al. to exploit multiple enrollment sessions. Specifically, Cumani et al. proposed the use of a 2nd-order SVM kernel for the binary classification of basic trials. In this new work, trials with multiple enrollment sessions are modelled by stacking the i-vectors of the test and enrollment sessions. We further exploit the fact that the score should be independent of the order of the enrollment recordings and present a simplified 2nd-order polynomial kernel scoring function accordingly. In the second part of this work we examine the utility of enrollment pruning for multi-session enrollments. Past work demonstrates that pruning can be beneficial for PLDA-based systems. We examine the effects of enrollment pruning in the context of the proposed SVM model. The results demonstrate that the multi-session enrollment SVM kernel is generally better than the model trained using single sessions. The model is also comparable in performance to the PLDA-based approach. Further gains are observed through combination of the PLDA and SVM scores.
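The order-independence requirement can be illustrated with a minimal sketch. It assumes a generic 2nd-order polynomial score on the stacked vector [test, enrollment 1, ..., enrollment N] and ties the parameters so that every enrollment block shares the same weights. The function names, the parameter names (`A_tt`, `A_te`, `A_ee`, `A_cross`, `b_t`, `b_e`, `c`) and the random values are hypothetical; the abstract does not give the paper's actual simplified scoring function, so this only shows why tying the blocks makes the score invariant to the enrollment order.

```python
import math
import random
from itertools import permutations


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def quad(u, M, v):
    """Bilinear form u^T M v, with M stored as a list of rows."""
    return dot(u, [dot(row, v) for row in M])


def tied_quadratic_score(test, enrolls, p):
    """2nd-order polynomial score with parameters tied across enrollments.

    Assumed (illustrative) form:
      t'A_tt t + 2 t'A_te m + sum_i e_i'A_ee e_i
      + sum_{i != j} e_i'A_cross e_j + b_t't + b_e'm + c,
    where m is the sum of the enrollment vectors.
    """
    m = [sum(col) for col in zip(*enrolls)]
    same = sum(quad(e, p["A_ee"], e) for e in enrolls)
    # sum over i != j equals m'Cm minus the i == j terms
    cross = quad(m, p["A_cross"], m) - sum(
        quad(e, p["A_cross"], e) for e in enrolls
    )
    return (
        quad(test, p["A_tt"], test)
        + 2.0 * quad(test, p["A_te"], m)
        + same
        + cross
        + dot(p["b_t"], test)
        + dot(p["b_e"], m)
        + p["c"]
    )


def random_params(dim, rng):
    """Random placeholder parameters; a real system would learn these."""
    def mat():
        return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]

    def vec():
        return [rng.gauss(0, 1) for _ in range(dim)]

    return {
        "A_tt": mat(), "A_te": mat(), "A_ee": mat(), "A_cross": mat(),
        "b_t": vec(), "b_e": vec(), "c": rng.gauss(0, 1),
    }


rng = random.Random(0)
dim = 4  # toy dimension; real i-vectors are much larger
params = random_params(dim, rng)
test_vec = [rng.gauss(0, 1) for _ in range(dim)]
enroll_vecs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]

# Score the same trial under every ordering of the enrollment sessions.
scores = [
    tied_quadratic_score(test_vec, list(perm), params)
    for perm in permutations(enroll_vecs)
]
```

All entries of `scores` agree up to floating-point rounding, because the enrollment vectors enter only through sums over sessions. An unconstrained quadratic form on the stacked vector would not have this property in general.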