Workshop paper

Foundation Models Enabling Multi-Scale Battery Materials Discovery: From Molecules To Devices

Abstract

Recent years have seen the rapid emergence and adoption of chemical foundation models in computational materials science for property prediction and generation tasks, focused mostly on small molecules or crystals. Despite this paradigm shift, integrating newly discovered materials into real-world devices remains a challenge because of design problems: a new candidate material must be optimized for compatibility with the other components in the system and must deliver the target performance. Benchmarks for chemical foundation models must therefore evaluate how well the models predict macro-scale outcomes that result from chemical interactions in a multivariate design space. This study evaluates how well chemical foundation models, pre-trained primarily on SMILES strings of small molecules, extrapolate from molecules to material design challenges across multiple length scales in batteries. Ten prediction models are trained, covering molecular properties, formulation performance, and battery device measurements. Material representations from several foundation models are compared, and their performance is benchmarked against conventional molecular representations such as Morgan fingerprints. The study further examines the models' capacity to generalize to out-of-distribution cases by quantifying prediction errors for novel material designs that differ substantially from the training data. Finally, the interpretability of the trained predictors is assessed by correlating actual outcomes and predictions with the chemical moieties in the datasets, with the aim of enabling researchers to interpret design rules in regions of chemical space where the model has high confidence.