SPEAKER ADAPTATION BASED ON PRE-CLUSTERING TRAINING SPEAKERS
Abstract
A new strategy for speaker adaptation is described that is based on: (1) pre-clustering all the speakers in the training set acoustically into clusters; (2) building, for each speaker cluster, a system using the data from the speakers who belong to that cluster; (3) finding, when a test speaker's data is available, the subset of these clusters closest to the test speaker; (4) transforming each of the selected clusters to bring it closer to the test speaker's acoustic space; and (5) building a speaker-adapted model from the transformed cluster models. This method solves the problem of excessive storage for the training speaker models, as it is relatively inexpensive to store a model for each cluster. Also, because each cluster contains a number of speakers, the parameters of the models for each cluster can be robustly estimated. The algorithm has been evaluated on a large vocabulary system and compared to existing algorithms. The improvement over existing algorithms such as MLLR [2] is statistically significant.
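
The five steps can be illustrated with a minimal Python sketch. It is a toy under strong assumptions, not the system evaluated here: each speaker and each cluster model is reduced to a single mean feature vector, clustering is plain k-means with a deterministic farthest-first start, and the per-cluster transformation is a partial bias shift toward the test speaker rather than a regression-based transform. The names `cluster_speakers`, `adapt_to_speaker`, `n_select`, and `alpha` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def cluster_speakers(speaker_means, n_clusters, n_iter=20):
    """Steps (1)-(2): k-means over per-speaker mean vectors (assumed stand-in).

    Each returned centroid plays the role of a "cluster model" in this toy.
    """
    speaker_means = np.asarray(speaker_means, dtype=float)
    # Deterministic farthest-first initialisation (an assumption of this sketch).
    centroids = [speaker_means[0]]
    while len(centroids) < n_clusters:
        nearest = np.min(
            [np.linalg.norm(speaker_means - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(speaker_means[nearest.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assign every speaker to its nearest centroid, then re-estimate centroids.
        dists = np.linalg.norm(
            speaker_means[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = speaker_means[labels == k].mean(axis=0)
    return labels, centroids

def adapt_to_speaker(test_mean, centroids, n_select=1, alpha=0.5):
    """Steps (3)-(5): select, transform, and combine cluster models."""
    test_mean = np.asarray(test_mean, dtype=float)
    # Step (3): pick the n_select clusters closest to the test speaker.
    dists = np.linalg.norm(centroids - test_mean, axis=1)
    selected = np.argsort(dists)[:n_select]
    # Step (4): assumed transform, a partial shift toward the test speaker.
    transformed = centroids[selected] + alpha * (test_mean - centroids[selected])
    # Step (5): combine the transformed cluster models with equal weights.
    return selected, transformed.mean(axis=0)

# Toy data: six training speakers forming two well-separated groups.
train = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = cluster_speakers(train, n_clusters=2)
selected, adapted = adapt_to_speaker([9, 9], centroids, n_select=1)
print(labels, selected, adapted)
```

Only one model per cluster is stored, not one per training speaker, which is the storage saving the method relies on. In a real recognizer each cluster model would be a full acoustic model and the shift in step (4) would be replaced by a properly estimated transformation.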