Trustworthy Generation
Data is key to technological innovation. We develop theoretical and algorithmic frameworks that enable generative AI to synthesize realistic, diverse, and targeted data. Our methods support data augmentation for trustworthy machine learning and accelerate the design of new drugs and materials, among other applications.
Our work
Teaching AI models to improve themselves
- Research
- Peter Hess
What is retrieval-augmented generation?
- Explainer
- Kim Martineau
Accelerating molecular optimization with AI
- Deep Dive
- Payel Das, Samuel Hoffman, Vijil Chenthamarakshan, Kahini Wadhawan, and Pin-Yu Chen
- 11 minute read
AI boosts the discovery of metamaterials vital for next-gen gadgets
- Research
- Youssef Mroueh, Karthikeyan Shanmugam, and Payel Das
- 10 minute read
IBM AI finds new peptides – paving the way to better drug design
- Research
- Aleksandra Mojsilovic and Payel Das
- 4 minute read
DualTKB: A Dual Learning Bridge between Text and Knowledge Base
- Research
- Pierre Dognin
- 6 minute read
Image captioning as an assistive technology
- News
- Youssef Mroueh
- 5 minute read
Publications
The Literary Canons of Large-Language Models: An Exploration of the Frequency of Novel and Author Generations Across Gender, Race and Ethnicity, and Nationality
- Paulina Toro Isaza
- Nalani Kopp
- 2025
- NAACL 2025
A transfer learning framework for weak to strong generalization
- Seamus Somerstep
- Felipe Maia Polo
- et al.
- 2025
- ICLR 2025
Out-of-Distribution Detection using Synthetic Data Generation
- Momin Abbas
- Muneeza Azmat
- et al.
- 2025
- ICLR 2025
Large Language Models can Become Strong Self-Detoxifiers
- Irene Ko
- Pin-Yu Chen
- et al.
- 2025
- ICLR 2025
Contextual Value Alignment
- Kush Varshney
- Miao Liu
- et al.
- 2025
- ICASSP 2025
Combinatorial Test Design Model Creation using Large Language Models
- Debbie Furman
- Eitan Farchi
- et al.
- 2025
- IWCT 2025