Concept-based Interpretable Deep Learning

Mateo Espinosa Zarlenga; Pietro Barbiero

AAAI 2025

Tutorial

25 Feb 2025

Concept-based Interpretable Deep Learning

Visit website

Abstract

As notoriously opaque deep neural networks (DNNs) become commonplace in powerful Artificial Intelligence (AI) systems, there has been a sharp increase in making DNNs interpretable by construction. Concept representation Learning (CL) [1, 2, 3, 4, 5, 6], a subfield in eXplainable AI (XAI), has emerged as a promising direction for designing high-performing interpretable neural architectures. At their core, CL methods learn an intermediate set of high-level concept representations (e.g., “stripped texture”, “round object”, etc.) from which they can predict a downstream task.

Driven by recent rich representations extracted from large datasets and models, CL has moved from a field constrained by the need for concept annotations to one where practical, powerful concept representations can be exploited without costly annotations. This tutorial aims to capitalise on the surge of interest in CL by equipping AI researchers and engineers with the necessary background to understand the current state of this body of work and build on it for their own research. Specifically, this tutorial will provide an overview of foundational and recent works in (i) supervised concept learning, (ii) unsupervised concept learning, and (iii) concept-based neuro-symbolic reasoning. We will conclude by highlighting several connections between CL and other areas in AI (e.g., disentanglement learning, representation learning, bias mitigation, etc.) and bringing forth key open questions within the field.

Conference paper