Talk

Natural Language-Driven Virtual Screening: Enhancing Drug Discovery with LLMs and Biomedical Foundation Models

Abstract

Cloud-based virtual screening methods are becoming increasingly essential for efficiently identifying potential drug candidates. However, traditional tools for searching and filtering compound and fragment databases often lack the flexibility and level of integration required to accommodate the diverse needs of the broader research community. Here, we present a cloud-deployed conversational assistant, delivered as a web application, that leverages large language models (LLMs) and an agentic approach to provide more flexible and creative database screening through natural language. This includes substructure-based filtering, which enables more precise identification of relevant compounds. Our solution also integrates IBM Biomedical Foundation Models (BMFM) to enrich existing datasets with biomedical and chemical information on demand, helping users obtain deeper insights without manual data augmentation. BMFM is provided through IBM OpenAD, an open-source framework for molecular and materials discovery developed by IBM Research. Throughout the process, the assistant operates without sending sensitive data directly to the LLMs, preserving confidentiality and security in cloud environments even when the LLMs are served outside the private infrastructure. In this presentation, we will discuss the architecture and workflow of our system and its capabilities in screening compound databases, offering a template for incorporating AI-driven tools into virtual drug discovery workflows operated through natural language. We believe that democratizing access to advanced drug discovery technologies through conversational interaction has strong potential to increase adoption and participation in drug discovery.
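To make the confidentiality claim concrete, the following is a minimal Python sketch of one way such a separation could work. It is not the system presented in the talk. It assumes that the LLM receives only the user's question and the column schema, returns a structured filter specification, and that the filter is then validated and executed locally. The table contents, the function names (`describe_schema`, `build_prompt`, `apply_filters`), the filter format, and the hard-coded model reply are all hypothetical.

```python
import json
import operator

# Hypothetical local compound table (illustrative values only). In this sketch
# the rows stay on the private side and are never placed in the prompt.
COMPOUNDS = [
    {"id": "cmpd-1", "smiles": "CCO", "mol_weight": 46.07, "logp": -0.31},
    {"id": "cmpd-2", "smiles": "c1ccccc1", "mol_weight": 78.11, "logp": 2.13},
    {"id": "cmpd-3", "smiles": "CC(=O)Oc1ccccc1C(=O)O", "mol_weight": 180.16, "logp": 1.19},
]

# Whitelist of comparison operators the model is allowed to request.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def describe_schema(rows):
    """Return column names and type names only, with no cell values."""
    return {name: type(value).__name__ for name, value in rows[0].items()}


def build_prompt(question, rows):
    """Compose the text that would be sent to the LLM: question plus schema."""
    return (
        "Translate the request into a JSON list of filters "
        '[{"column": ..., "op": ..., "value": ...}].\n'
        f"Columns: {json.dumps(describe_schema(rows))}\n"
        f"Request: {question}"
    )


def apply_filters(rows, spec_json):
    """Validate the model's filter spec, then run it locally against the rows."""
    filters = json.loads(spec_json)
    columns = set(rows[0])
    for f in filters:
        if f["column"] not in columns or f["op"] not in OPS:
            raise ValueError(f"unsupported filter: {f}")
    return [
        row
        for row in rows
        if all(OPS[f["op"]](row[f["column"]], f["value"]) for f in filters)
    ]


prompt = build_prompt("compounds under 100 Da with logP above 0", COMPOUNDS)

# Stand-in for the model's reply; a real deployment would call an LLM here.
llm_reply = (
    '[{"column": "mol_weight", "op": "<", "value": 100},'
    ' {"column": "logp", "op": ">", "value": 0}]'
)

hits = apply_filters(COMPOUNDS, llm_reply)
print([row["id"] for row in hits])  # prints ['cmpd-2']
```

In this arrangement the prompt carries column names and types but no identifiers or structures, and the operator whitelist keeps the model's reply from triggering anything other than simple comparisons. Substructure filtering could in principle follow the same pattern, with the model emitting a query pattern and a cheminformatics toolkit performing the match locally, although that step is not shown here.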

Related