Knowledge Distillation Based Training of Unified Conformer CTC Models for Multi-form ASR. Takashi Fukuda, Gakuto Kurata, et al. 2025. ICASSP 2025.
LLM based Text Generation for Improved Low-resource Speech Recognition Models. Tohru Nagano, Gakuto Kurata, et al. 2025. ICASSP 2025.
Bilevel Joint Unsupervised and Supervised Training for Automatic Speech Recognition. Xiaodong Cui, A.F.M. Saif, et al. 2024. IEEE/ACM TASLP.
Speech-enriched Memory for Inference-time Adaptation of ASR Models to Word Dictionaries. Ashish Mittal, Sunita Sarawagi, et al. 2023. EMNLP 2023.
Improving RNN Transducer Acoustic Models for English Conversational Speech Recognition. Xiaodong Cui, George Saon, et al. 2023. INTERSPEECH 2023.
Multi-Speaker Data Augmentation for Improved end-to-end Automatic Speech Recognition. Samuel Thomas, Hong-Kwang J. Kuo, et al. 2023. ICASSP 2023.
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. Jiatong Shi, George Saon, et al. 2022. INTERSPEECH 2022.
Global RNN Transducer Models For Multi-dialect Speech Recognition. Takashi Fukuda, Samuel Thomas, et al. 2022. INTERSPEECH 2022.
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. Xiaodong Cui, George Saon, et al. 2022. INTERSPEECH 2022.