Samuel Thomas

Title

Senior Research Scientist - Speech Recognition and Spoken Language Understanding

Bio

Samuel Thomas received his B.Tech degree in Computer Engineering from the Cochin University of Science and Technology, India, and his M.S. degree in Computer Science and Engineering from the Indian Institute of Technology Madras, India, before earning his Doctor of Philosophy degree from Johns Hopkins University, Baltimore. Since graduation, he has been with the Speech Technologies Group at the IBM T.J. Watson Research Center, New York. He has previously worked on several speech research projects and workshops with the Center for Language and Speech Processing (CLSP) at JHU, the Idiap Research Institute, Switzerland, and the TeNeT group, IIT Madras. His research interests include speech processing and machine learning for speech recognition, spoken language understanding, speech synthesis, and speaker recognition. Samuel is an IBM Master Inventor, a Senior Member of the IEEE, and an Associate Editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing. He is also an elected member of the IEEE Speech and Language Technical Committee (SLTC).

Publications

Towards End-to-end Integration of Dialog History For Improved Spoken Language Understanding
- - Vishal Sunder
  - Samuel Thomas
  - et al.
- 2022
- ICASSP 2022
Improving End-to-End Models for Set Prediction in Spoken Language Understanding
- - Jeff Kuo
  - Zoltan Tuske
  - et al.
- 2022
- ICASSP 2022
Cascaded multilingual audio-visual learning from videos
- - Andrew Rouditchenko
  - Angie Boggust
  - et al.
- 2021
- INTERSPEECH 2021
AVLnet: Learning audio-visual language representations from instructional videos
- - Andrew Rouditchenko
  - Angie Boggust
  - et al.
- 2021
- INTERSPEECH 2021
Speak or chat with me: End-to-end spoken language understanding system with flexible inputs
- - Sujeong Cha
  - Wangrui Hou
  - et al.
- 2021
- INTERSPEECH 2021
Knowledge distillation based training of universal ASR source models for cross-lingual transfer
- - Takashi Fukuda
  - Samuel Thomas
- 2021
- INTERSPEECH 2021
Integrating dialog history into end-to-end spoken language understanding systems
- - Jatin Ganhotra
  - Samuel Thomas
  - et al.
- 2021
- INTERSPEECH 2021
Resource-efficient TDNN Architectures for Audio-visual Speech Recognition
- - Alexandros Koumparoulis
  - Gerasimos Potamianos
  - et al.
- 2021
- EUSIPCO 2021
RNN transducer models for spoken language understanding
- - Samuel Thomas
  - Hong-Kwang J. Kuo
  - et al.
- 2021
- ICASSP 2021
End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features
- - Edmilson Morais
  - Hong-Kwang J. Kuo
  - et al.
- 2021
- ICASSP 2021

Patents

Annealed Dropout Training Of Neural Networks

Denoising A Signal

Using Long Short Term Memory Recurrent Neural Network For Speaker Diarization Segmentation

Combining Installed Audio-visual Sensors With Ad-hoc Mobile Audio-visual Sensors For Smart Meeting Rooms

Acoustic Model Training

Multi-pass Speech Activity Detection Strategy To Improve Automatic Speech Recognition

Top collaborators

Rogerio Feris

Brian Kingsbury

Takashi Fukuda

Gakuto Kurata