Ilias Iliadis
International Journal On Advances In Networks And Services
Generative AI applications are transforming industries through their ability to answer questions and generate content. Although LLMs are trained on an immense amount of information, the results they generate may be hallucinated or out of date. Semantic search technologies that supply context-relevant input are therefore indispensable for reducing these effects. This input is obtained through a process called Retrieval Augmented Generation (RAG), which extracts related facts from large data stores such as Vector DBs. The number of vectors to be searched is growing toward several billion and can no longer be kept in DRAM, which motivates offloading the search to storage devices. We present SSD controller architectures for computational storage devices (CSDs) that perform in-storage similarity searches, and we review data placement strategies for highly parallelized in-storage similarity search that can scale to multiple billions of vectors within a single device. In particular, we present results from an implementation that uses inverted-index and graph-based approaches to provide coarse- and fine-grained search capabilities, and we introduce NVMe CSD interfaces for handling Vector DB information and performing searches efficiently.
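The coarse and fine-grained search idea mentioned in the abstract can be illustrated with a minimal host-side sketch of an inverted-index search. It is not the in-storage implementation described in the paper: the function names (`build_inverted_index`, `search`), the `nprobe` and `k` parameters, and the use of squared Euclidean distance are all assumptions made for illustration, and the fine-grained step here is a plain exhaustive scan of the selected lists rather than a graph-based search.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_inverted_index(vectors, centroids):
    """Assign each vector id to the list of its nearest centroid (hypothetical helper)."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: dist2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def search(query, vectors, centroids, lists, nprobe=1, k=1):
    """Coarse step: pick the nprobe closest centroids.
    Fine step: rank only the vectors stored in those lists."""
    probed = sorted(range(len(centroids)), key=lambda c: dist2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probed for vid in lists[c]]
    return sorted(candidates, key=lambda vid: dist2(query, vectors[vid]))[:k]
```

In this sketch, each inverted list can be scanned independently of the others, which is the property that makes this style of index a natural fit for parallel processing; how lists are actually placed across storage resources is a design decision of the paper and is not modeled here.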
Erik Altman, Jovan Blanusa, et al.
NeurIPS 2023
Haoran Qiu, Weichao Mao, et al.
ASPLOS 2024
Jose Manuel Bernabé Murcia, Eduardo Canovas Martinez, et al.
MobiSec 2024