Talk

Novel LLM-powered assistant facilitating data visualization in research and education

Abstract

In the era of big data, the ability to quickly interpret and visualize complex datasets is paramount for advancing scientific discovery. Although widely used, traditional tools such as Excel and Origin often struggle to create sophisticated visualizations quickly and efficiently on demand from new datasets. To address this limitation, we have developed a visualization assistant that leverages large language models (LLMs) and the Vega-Lite grammar to produce a diverse array of data visualizations on demand within seconds. This assistant not only accelerates the visualization process but also enables the creation of complex and interactive visualizations that are challenging to construct with conventional tools, or with Matplotlib, which is frequently used in data science. Initially, we explored fine-tuning LLMs to specialize them for our visualization tasks. However, this approach proved difficult and ineffective because of several drawbacks: high computational costs, lengthy training times, the level of expertise required, and the considerable overhead of adapting to new visualization types over time. In our talk, we will present how we overcame these challenges by employing Retrieval-Augmented Generation (RAG)-based in-context learning. We will cover dataset creation, the architecture and workflow of our visualization assistant, and its current capabilities, including creating various chart types, incorporating aggregations, and adding interactive elements. All visualizations can be created from simple natural-language queries, and because the actual data is never sent directly to the LLMs, confidentiality is preserved. Furthermore, we will present recent advancements in transitioning to agentic workflows. We believe that our approach streamlines the visualization process and democratizes access to advanced on-demand visualizations, and that it can serve as a template for developing RAG-based in-context learning systems for applications in research and education.
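
The workflow described above can be illustrated with a minimal Python sketch. It is not the assistant's actual implementation: the example store, the function names, and the prompt wording are all hypothetical, a simple string similarity from the standard library stands in for whatever retrieval method the real system uses, and no LLM is called. The sketch only shows the general idea of retrieving similar examples for in-context learning, sending the column schema rather than the data values, and attaching the data locally to the returned Vega-Lite specification.

```python
import json
from difflib import SequenceMatcher

# Hypothetical example store: (request, Vega-Lite spec) pairs used as few-shot context.
EXAMPLES = [
    ("bar chart of average value per category",
     {"mark": "bar",
      "encoding": {"x": {"field": "category", "type": "nominal"},
                   "y": {"aggregate": "mean", "field": "value",
                         "type": "quantitative"}}}),
    ("scatter plot of x versus y",
     {"mark": "point",
      "encoding": {"x": {"field": "x", "type": "quantitative"},
                   "y": {"field": "y", "type": "quantitative"}}}),
    ("line chart of value over time",
     {"mark": "line",
      "encoding": {"x": {"field": "date", "type": "temporal"},
                   "y": {"field": "value", "type": "quantitative"}}}),
]


def infer_schema(rows):
    """Map each column to a Vega-Lite type, looking at the first row only.

    Assumes `rows` is a non-empty list of dicts with identical keys.
    """
    schema = {}
    for name, value in rows[0].items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        schema[name] = "quantitative" if is_number else "nominal"
    return schema


def retrieve(query, k=2):
    """Return the k stored examples most similar to the query.

    String similarity is a stand-in (assumption) for a real retriever.
    """
    def score(example):
        return SequenceMatcher(None, query.lower(), example[0]).ratio()
    return sorted(EXAMPLES, key=score, reverse=True)[:k]


def build_prompt(query, schema, examples):
    """Assemble the prompt from the schema and retrieved examples, never the data."""
    shots = "\n\n".join(
        f"Request: {text}\nSpec: {json.dumps(spec)}" for text, spec in examples
    )
    return (
        "Write a Vega-Lite spec for the request, using only these columns.\n"
        f"Columns: {json.dumps(schema)}\n\n{shots}\n\nRequest: {query}\nSpec:"
    )


def attach_data(spec, rows):
    """Add the actual data to a generated spec locally, after the LLM step."""
    return {**spec, "data": {"values": rows}}
```

In this sketch, the prompt returned by `build_prompt` would be sent to the model, and `attach_data` would be applied to the specification that comes back, so the rows themselves stay on the user's machine.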
