Speech
As more of the world moves online, demand is growing rapidly for systems that can understand users and speak to them in natural language. At IBM Research, we're working on next-generation AI that learns to decipher and replicate the way humans speak.
Our work
Training a customer service bot to sound more human
- Research
- Kim Martineau
Converting several audio streams into one voice makes it easier for AI to learn
- Research
- Kim Martineau
The pandemic changed the way we understand speech
- Research
- Rachel Ostrand
Austin or Boston? Making artificial speech more expressive, natural, and controllable
- Research
- Slava Shechtman, Raul Fernandez, and David Haws
- 8 minute read
Speech-to-text AI could help doctors prescribe placebo to ease chronic pain
- Research
- Sara Berger
- 6 minute read
A cognitive in-car companion to help us enjoy the journey
- Research
- 4 minute read
Publications
Knowledge Distillation Based Training of Unified Conformer CTC Models for Multi-form ASR
- 2025
- ICASSP 2025
A Non-autoregressive Model for Joint STT and TTS
- Vishal Sunder
- Brian Kingsbury
- et al.
- 2025
- ICASSP 2025
LLM based Text Generation for Improved Low-resource Speech Recognition Models
- 2025
- ICASSP 2025
Beyond neuropsychological tests: AI speech analysis in PKU
- Susan Waisbren
- Kely Norel
- et al.
- 2024
- J. Inherit. Metab. Dis.
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
- Yuchen Hu
- Chen Chen
- et al.
- 2024
- NeurIPS 2024
Robust ASR Error Correction with Conservative Data Filtering
- Takuma Udagawa
- Masayuki Suzuki
- et al.
- 2024
- EMNLP 2024