Neural network architectures for rotated character recognition
Hiroyasu Takahashi
ICPR 1992
In recent years, automatic recognition of spoken languages has become an important feature in a variety of speech-enabled multilingual applications that, besides accuracy, also demand efficient and "linguistically scalable" algorithms. This paper deals with a particularly successful approach based on phonotactic-acoustic features and presents systems for language identification as well as for unknown-language rejection. An architecture with multipath decoding, improved phonotactic models using binary-tree structures, and acoustic pronunciation models serve as a framework for experiments and discussion on these two tasks. In particular, language identification accuracy on a telephone-speech task (NIST'95 evaluation) in six and nine languages is presented, together with results from a perceptual experiment carried out with human listeners. The performance of language rejection based on phonotactic modeling combined with a monolingual LVCSR system in the domain of broadcast news transcription is also reported. Besides yielding state-of-the-art performance, the described systems are computationally inexpensive and easily extensible (scalable) to new languages without the need for linguistic experts.