Taku Ito, Luca Cocchi, et al.
ICML 2025
Large language models (LLMs) and generative AI (GenAI) are at the forefront of frontier AI research and technology. As their popularity and availability grow rapidly, concerns about their misuse and safety risks are becoming more prominent than ever. In this talk, we introduce a unified computational framework for evaluating and improving safety across a wide range of challenges in generative AI. Specifically, we present new tools and insights for exploring and mitigating the safety and robustness risks of state-of-the-art LLMs and GenAI models, including (i) safety risks in fine-tuning LLMs, (ii) LLM jailbreak mitigation, (iii) prompt engineering for safety debugging, and (iv) robust detection of AI-generated content.
Abhishek Aich, Akash Gupta, et al.
CVPR 2020
Saiteja Utpala, Alex Gu, et al.
NAACL 2024
Kristjan Greenewald, Yuancheng Yu, et al.
NeurIPS 2024