Ensembles of multi-scale VGG acoustic models
Michael Heck, Masayuki Suzuki, et al.
INTERSPEECH 2017
The stream of words produced by Automatic Speech Recognition (ASR) systems is typically devoid of punctuation and formatting. Most natural language processing applications expect segmented, well-formatted text as input, which ASR output does not provide. This paper proposes a novel technique for jointly modeling multiple correlated tasks, such as punctuation and capitalization, using bidirectional recurrent neural networks, which leads to improved performance on each of these tasks. The method could be extended to joint modeling of other correlated sequence labeling tasks.
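To make the sequence labeling framing concrete, the following is a minimal sketch of how punctuation and capitalization can be expressed as two label sequences over the same ASR-style token stream. It does not implement the paper's bidirectional recurrent network. The label names (COMMA, CAP, ALL_CAPS, and so on) and the helper functions `to_labels` and `from_labels` are hypothetical, chosen only for illustration; the abstract does not specify the label inventory.

```python
import re

# Hypothetical label inventory; the abstract does not specify one.
PUNCT = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
PUNCT_INV = {label: mark for mark, label in PUNCT.items()}


def to_labels(text):
    """Split formatted text into lowercase tokens plus two parallel label
    sequences: the punctuation following each token, and its casing."""
    tokens, punct, caps = [], [], []
    for word, mark in re.findall(r"([A-Za-z']+)([,.?]?)", text):
        tokens.append(word.lower())
        punct.append(PUNCT.get(mark, "NONE"))
        if len(word) > 1 and word.isupper():
            caps.append("ALL_CAPS")
        elif word[0].isupper():
            caps.append("CAP")
        else:
            caps.append("LOWER")
    return tokens, punct, caps


def from_labels(tokens, punct, caps):
    """Apply predicted labels to an unformatted token stream."""
    out = []
    for tok, p, c in zip(tokens, punct, caps):
        if c == "ALL_CAPS":
            tok = tok.upper()
        elif c == "CAP":
            tok = tok.capitalize()
        out.append(tok + PUNCT_INV.get(p, ""))
    return " ".join(out)


tokens, punct, caps = to_labels("Hello, how are you? I use ASR.")
restored = from_labels(tokens, punct, caps)
```

In a joint model, both label sequences would be predicted from a shared representation of `tokens`, so that, for example, evidence for a sentence-final period can inform the capitalization of the next word. The labeling scheme in this sketch is lossy for mixed-case words, which fall under LOWER.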