Serving LLMs as detectors in workflows with guardrails
Abstract
With the growing use of Large Language Models (LLMs) in generative AI applications comes a growing need to verify, moderate, or “guardrail” LLM inputs and outputs. Guardrailing can be done with anything from simple regex detections to more sophisticated techniques, such as using LLMs themselves to detect undesired content. At the same time, considerable effort has gone into creating and optimizing LLM serving solutions. This paper describes our experience using an adapter pattern with an LLM serving architecture to provide LLMs as guardrail models. The design trade-offs we detail, such as performance and model accessibility, can inform other LLM-based software architectures.
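To make the adapter idea concrete, the following is a minimal Python sketch, not the paper's actual implementation: it assumes a hypothetical LLM serving endpoint reachable over HTTP, an illustrative request/response JSON shape, and made-up names (Detector, RegexDetector, LLMDetectorAdapter). It shows how a simple regex detector and an LLM-backed detector can be exposed through one guardrail interface.

import re
from abc import ABC, abstractmethod

import requests  # HTTP client for the (assumed) LLM serving endpoint


class Detector(ABC):
    """Common guardrail interface: return True if the text should be flagged."""

    @abstractmethod
    def detect(self, text: str) -> bool: ...


class RegexDetector(Detector):
    """Simple guardrail backed by a regular expression."""

    def __init__(self, pattern: str):
        self._pattern = re.compile(pattern, re.IGNORECASE)

    def detect(self, text: str) -> bool:
        return self._pattern.search(text) is not None


class LLMDetectorAdapter(Detector):
    """Adapter exposing an LLM behind a serving endpoint as a Detector.

    The endpoint URL, payload fields, and yes/no answer convention are
    assumptions for illustration.
    """

    def __init__(self, endpoint: str, prompt_template: str):
        self._endpoint = endpoint
        self._prompt_template = prompt_template

    def detect(self, text: str) -> bool:
        prompt = self._prompt_template.format(text=text)
        resp = requests.post(
            self._endpoint,
            json={"prompt": prompt, "max_tokens": 4},
            timeout=30,
        )
        resp.raise_for_status()
        # Assume the served classifier model answers "yes" or "no".
        return resp.json()["text"].strip().lower().startswith("yes")


# Both detectors plug into the same guardrail workflow interchangeably:
detectors: list[Detector] = [
    RegexDetector(r"\b(ssn|social security number)\b"),
    LLMDetectorAdapter(
        "http://llm-serving:8000/v1/completions",  # hypothetical endpoint
        "Does the following text contain unsafe content? "
        "Answer yes or no.\n{text}",
    ),
]


def guardrail(text: str) -> bool:
    """Flag text if any configured detector fires."""
    return any(d.detect(text) for d in detectors)

From the workflow's point of view, the regex check and the served LLM are interchangeable detectors, which is the essence of the adapter approach the abstract describes.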