Sequential Estimation with Optimal Forgetting for Robust Speech Recognition
Abstract
Mismatch is known to degrade the performance of speech recognition systems. In real-life applications we often encounter nonstationary mismatch sources. A general way to compensate for slowly time-varying mismatch is to use sequential algorithms with forgetting. The forgetting factor is usually chosen empirically on some development data, with no optimality criterion. In this paper we introduce a framework for obtaining an optimal forgetting factor. In sequential algorithms, a recursion is usually used to calculate the required parameters so as to optimize a certain performance measure. To obtain optimal forgetting, we develop a recursion that calculates the forgetting factor optimizing the same performance criterion as the original recursion. When combined, the two recursions yield a sequential algorithm that simultaneously optimizes the desired parameters and the forgetting factor. The proposed method is applied in conjunction with a sequential noise estimation algorithm, but the same principle can be extended to a wide range of sequential algorithms. The algorithm is extensively evaluated on different speech recognition tasks: the 5K Wall Street Journal task corrupted by different types of artificially added noise, a command and digit database recorded in a noisy car environment, and a 20K Japanese broadcast news task corrupted by field noise. In all situations the sequential algorithm was found to perform as well as or better than batch estimation. In addition, the proposed optimal forgetting algorithm performs as well as the best hand-tuned forgetting factor. The result is a continuously adaptive compensation technique that requires no manual adjustment.
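To make the idea concrete, the following is a minimal illustrative sketch (not the paper's exact algorithm) of coupling the two recursions for the simplest possible case: a scalar estimate tracked by exponential forgetting, with the forgetting factor adapted by a stochastic-gradient step on the same squared one-step prediction error that the estimator itself minimizes. The function name, step size `eta`, and clipping bounds are all assumptions introduced for illustration.

```python
import numpy as np

def adaptive_forgetting_mean(x, lam0=0.95, eta=1e-3, lam_min=0.8, lam_max=0.999):
    """Sequential mean estimate with an adaptive forgetting factor (sketch).

    Tracks a slowly varying mean via
        mu_t = lam * mu_{t-1} + (1 - lam) * x_t
    and adapts lam by a gradient step on the squared prediction error
        e_t = x_t - mu_{t-1},
    using the sensitivity recursion d(mu)/d(lam). Both recursions thus
    optimize the same criterion, mirroring the idea in the abstract.
    """
    x = np.asarray(x, dtype=float)
    mu = x[0]
    lam = lam0
    dmu_dlam = 0.0                      # sensitivity of mu w.r.t. lam
    mus, lams = [mu], [lam]
    for xt in x[1:]:
        e = xt - mu                     # one-step prediction error
        # d(0.5*e^2)/d(lam) = -e * dmu_dlam, so step lam downhill:
        lam = float(np.clip(lam + eta * e * dmu_dlam, lam_min, lam_max))
        # sensitivity update: d/dlam [lam*mu + (1-lam)*x] = (mu - x) + lam*dmu_dlam
        dmu_dlam = (mu - xt) + lam * dmu_dlam
        # parameter update with the current forgetting factor
        mu = lam * mu + (1.0 - lam) * xt
        mus.append(mu)
        lams.append(lam)
    return np.array(mus), np.array(lams)
```

In the paper's setting the tracked quantity is a noise estimate rather than a scalar mean, and the optimized criterion comes from the noise estimation algorithm, but the structure is the same: one recursion updates the parameters, a second recursion updates the forgetting factor against the same objective.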