Talk

ChemChat | Democratizing Access to Domain-Specific AI/ML through LLM-powered Conversational Assistants

Abstract

In recent years, computational chemistry and machine learning have advanced rapidly, yielding powerful tools and AI models. Despite this progress, these resources remain underutilized because of high technical barriers and their tendency to operate in silos. The need for programming and ML expertise further restricts access for many domain experts, e.g., experimentalists. Meanwhile, large language models (LLMs) from OpenAI (GPT), Google (Gemini), Meta (Llama), xAI (Grok), and Anthropic (Claude) have revolutionized various sectors over the last two years. However, their application in chemistry, even with the recent GPT-o1, remains limited by deficiencies in understanding scientific workflows and domain-specific tasks, in access to data sources, in domain-based reasoning, and in accurate referencing. These deficiencies often lead to incorrect and hallucinated responses that undermine trust and reliability.

This critical gap between AI and scientific disciplines can be bridged by equipping LLM-powered conversational assistants with specialized cheminformatics tools and AI models and by providing tailored instructions that allow them to plan actions correctly. This approach promises to (I) increase the adoption of cheminformatics tools and AI models, (II) democratize AI/ML accessibility within the field, and (III) ultimately enhance scientific discovery and education.

In this talk, we introduce ChemChat, a fully functional, cloud-deployed proof-of-concept conversational assistant for materials science and data visualization, and present our advancements towards agentic systems. ChemChat features a chatbot-driven web application interface and is powered by non-OpenAI LLMs. By integrating existing cheminformatics tools and advanced AI models (including PubChem, CIRCA, RDKit, GT4SD, RXN, MolFormer, DeepSearch, and other knowledge sources), ChemChat aids chemists with tasks such as property calculations, molecule design, retrosynthesis, data visualization, and literature research. Our presentation will cover ChemChat’s architecture, its use of in-context learning, specific use cases, and a comparison with popular applications such as ChatGPT.
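The idea of equipping an assistant with domain tools can be illustrated with a minimal sketch. This is not ChemChat's actual architecture, which the abstract does not detail. It assumes a hypothetical setup in which the LLM emits a structured action (a tool name plus an input) and the application dispatches it to a registered function. The names `TOOLS`, `dispatch`, `molecular_weight`, and `atom_count` are illustrative only, and the toy tools stand in for real backends such as RDKit or PubChem.

```python
import re
from typing import Callable, Dict

# Approximate standard atomic weights (g/mol) for a few elements only.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Matches an element symbol followed by an optional count, e.g. "C2", "O".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def molecular_weight(formula: str) -> float:
    """Toy tool: weight of a simple formula such as 'C2H6O' (no brackets)."""
    total = 0.0
    for symbol, count in _TOKEN.findall(formula):
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return round(total, 3)


def atom_count(formula: str) -> int:
    """Toy tool: total number of atoms in a simple formula."""
    return sum(int(count) if count else 1 for _, count in _TOKEN.findall(formula))


# Hypothetical tool registry: the names an LLM would be told about in its prompt.
TOOLS: Dict[str, Callable[[str], object]] = {
    "molecular_weight": molecular_weight,
    "atom_count": atom_count,
}


def dispatch(action: dict) -> object:
    """Run the tool named in an action of the form {'tool': ..., 'input': ...}."""
    name = action["tool"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](action["input"])


if __name__ == "__main__":
    # In a real assistant this action would come from the LLM's planned output;
    # here it is hard-coded for illustration.
    print(dispatch({"tool": "molecular_weight", "input": "C2H6O"}))
```

Routing calculations through deterministic tools rather than letting the model answer from memory is one common way to reduce the hallucinated responses mentioned above; the tailored instructions would tell the model which tools exist and when to call them.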

Related