Dimitri Kanevsky, David Nahamoo, et al.
ICASSP 2011
The acoustic-modeling problem in automatic speech recognition is examined with the specific goal of unifying discrete and continuous parameter approaches. To model a sequence of information-bearing acoustic feature vectors that has been extracted from the speech waveform via some appropriate front-end signal processing, a speech recognizer faces two alternatives: a) assign a multivariate probability distribution directly to the stream of vectors, or b) use a time-synchronous labeling acoustic processor to perform vector quantization on this stream, and assign a multinomial probability distribution to the output of the vector quantizer. With few exceptions, these two methods have traditionally been given separate treatment. Here we consider a class of very general hidden Markov models which can accommodate feature vector sequences lying either in a discrete or in a continuous space; the new class allows one to represent the prototypes in an assumption-limited, yet convenient way, as tied mixtures of simple multivariate densities. Speech recognition experiments, reported for two office correspondence tasks (5000- and 20 000-word vocabulary), demonstrate some of the benefits associated with this technique. © 1990 IEEE
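To make the contrast between the two alternatives concrete, the following is a minimal sketch and not the authors' implementation. It assumes diagonal-covariance Gaussians as the "simple multivariate densities", and all function and variable names are hypothetical. The discrete path vector-quantizes a feature vector to its nearest prototype and looks up a per-state multinomial probability; the tied-mixture path lets every state share the same set of densities and differ only in its mixture weights.

```python
import numpy as np

def diag_gaussian_pdf(x, means, variances):
    """Density of x under each of K diagonal-covariance Gaussians; returns shape (K,)."""
    d = x.shape[0]
    log_norm = -0.5 * (d * np.log(2 * np.pi) + np.sum(np.log(variances), axis=1))
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=1)
    return np.exp(log_norm + log_exp)

def discrete_emission(x, means, label_probs):
    """Alternative (b): quantize x to the nearest prototype (Euclidean distance,
    an assumption), then read each state's multinomial probability for that label."""
    label = np.argmin(np.sum((x - means) ** 2, axis=1))
    return label_probs[:, label]

def tied_mixture_emission(x, means, variances, weights):
    """Tied mixture: all states share the same K densities; only the
    per-state weights (rows of `weights`, each summing to 1) differ."""
    return weights @ diag_gaussian_pdf(x, means, variances)

# Toy setup: 2 states, K = 2 shared prototypes in a 2-dimensional feature space.
means = np.array([[0.0, 0.0], [4.0, 4.0]])
variances = np.ones((2, 2))
weights = np.array([[0.9, 0.1],   # state 0 mostly uses prototype 0
                    [0.2, 0.8]])  # state 1 mostly uses prototype 1
x = np.array([0.0, 0.0])

print(discrete_emission(x, means, weights))
print(tied_mixture_emission(x, means, variances, weights))
```

In this sketch the same weight matrix serves as the multinomial label probabilities in the discrete case and as the mixture weights in the continuous case, which is one way to see how a single model class can cover both settings: the hard nearest-prototype decision is replaced by a soft weighting over all shared densities.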
Mukund Padmanabhan, Lalit R. Bahl, et al.
IEEE Transactions on Speech and Audio Processing
Eveline J. Bellegarda, Jerome R. Bellegarda, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Tara N. Sainath, Bhuvana Ramabhadran, et al.
IEEE Transactions on Audio, Speech and Language Processing