Publication
IEEE Transactions on Speech and Audio Processing
Paper

Alignment-based codeword-dependent cepstral normalization

Abstract

This paper proposes the alignment-based codeword-dependent cepstral normalization (ACDCN) algorithm, which aims to alleviate the acoustic mismatch that occurs when a speech recognizer faces environmental conditions not observed in the training data. ACDCN is based on the linear channel model of the environment originally proposed by Acero and on the CDCN solution to this model [1]. ACDCN replaces the codebook (a Gaussian mixture model, GMM) employed by CDCN with the state distributions of the recognizer's HMMs, under the assumption that these HMM distributions model the associated speech segments better than the general GMM distribution does. The feature-frame to HMM-state association is obtained by aligning the hypothesis of a first decoding pass. From this alignment, ACDCN estimates the environmental parameters (noise and channel vectors), which are then used to obtain an MMSE estimate of the clean speech vectors, in a way similar to [1]. In experiments on the Aurora-2 noisy digits database, ACDCN reduces the overall error rate by more than 30% in the 0 to 20 dB noise range.
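
The compensation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several simplifying assumptions that are not in the abstract: features are treated in the log-spectral domain (so the cepstral transform is omitted), each HMM state is represented by a single Gaussian mean, the alignment is hard (one state per frame), and the noise and channel vectors are taken as already known rather than estimated from the alignment. Under those assumptions, the CDCN-style relation z = x + q + log(1 + exp(n - q - x)) gives a per-state shift that is subtracted from each noisy frame. The names `correction`, `compensate`, `state_means`, `noise`, and `channel` are hypothetical.

```python
import math

def correction(mean, noise, channel):
    """Shift q + log(1 + exp(n - q - mu)) per dimension (log-spectral domain, assumed)."""
    return [q + math.log1p(math.exp(n - q - m))
            for m, n, q in zip(mean, noise, channel)]

def compensate(frames, alignment, state_means, noise, channel):
    """Subtract from each noisy frame the shift predicted for its aligned state.

    frames      : list of noisy feature vectors z_t
    alignment   : list of state indices, one per frame (from a first decoding pass)
    state_means : list of clean-speech mean vectors, one per state (assumed single Gaussian)
    noise, channel : environment vectors n and q (assumed known here)
    """
    clean = []
    for z, state in zip(frames, alignment):
        shift = correction(state_means[state], noise, channel)
        clean.append([zi - si for zi, si in zip(z, shift)])
    return clean
```

When the noise is negligible relative to the speech, the shift reduces to the channel vector alone, so the sketch simply removes the channel offset; with multiple Gaussians per state, the shift would instead be a posterior-weighted average over the state's components.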
