View-invariant alignment and matching of video sequences
Cen Rao, Alexei Gritai, et al.
ICCV 2003
In a large-vocabulary speech recognition system using hidden Markov models, calculating the likelihood of an acoustic signal segment for all the words in the vocabulary involves a large amount of computation. In order to run in real time on a modest amount of hardware, it is important that these detailed acoustic likelihood computations be performed only on words which have a reasonable probability of being the word that was spoken. We describe a scheme for rapidly obtaining an approximate acoustic match for all the words in the vocabulary in such a way as to ensure that the correct word is, with high probability, one of a small number of words examined in detail. Using fast search methods, we obtain a matching algorithm that is about a hundred times faster than doing a detailed acoustic likelihood computation on all the words in the IBM Office Correspondence isolated-word dictation task, which has a vocabulary of 20 000 words. We give experimental results showing the effectiveness of such a fast match for a number of talkers. © 1993 IEEE
David W. Jacobs, Daphna Weinshall, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Conrad Albrecht, Jannik Schneider, et al.
CVPR 2025
Nicholas Mastronarde, Deepak S. Turaga, et al.
ICIP 2006