SOFAI-LM: A Cognitive Architecture for Building Efficient and Reliable Reasoning Systems with LLMs

Vedant Khandelwal; Francesca Rossi; Vishal Pallagani; Keerthiram Murugesan; Lior Horesh

AAAI 2026

Tutorial

20 Jan 2026

Abstract

This lab shows participants how to build reliable, compute-efficient reasoning systems by instantiating SOFAI-LM, an LLM-first cognitive architecture that wraps a fast language model with a training-free metacognitive controller and selectively falls back to a slower reasoning model only when needed. Attendees will implement the full loop – evaluation, targeted feedback, iterative refinement, and principled fallback – in two contrasting domains: global constraint satisfaction in graph coloring and localized bug fixing in code debugging. We begin with a brief motivation from “fast and slow” reasoning and failure modes of raw LLM prompting, then introduce the SOFAI-LM controller: how to evaluate candidate solutions with domain-specific functions, design concise yet informative feedback, set iteration budgets, and log accuracy–time trade-offs. Participants will then run and modify a minimal SOFAI-LM pipeline using open models served through Ollama, either locally or via a Colab notebook, so that every step is reproducible on modest hardware. Hands-on segments guide attendees through (i) encoding graph coloring as a decision problem, using a correctness function that measures the fraction of properly colored edges and feedback that pinpoints conflicts, and (ii) debugging Python/C++ snippets using test-based evaluation, failure-focused feedback, and different ways of passing context to the slower model (problem only, best attempt, or full interaction history). Throughout, we highlight how choices about feedback style, memory, and fallback strategy change the accuracy–time curve. By the end of the lab, participants will have working templates for SOFAI-LM pipelines, ready-to-adapt evaluators and prompts for new domains, and practical checklists for configuring inference, ensuring that LLM-based systems are both fast and reliable under real-world compute budgets. 
The lab is designed for graduate students, researchers, and practitioners who are familiar with basic Python, Git, and prompt engineering, and who seek concrete patterns for turning large language models into governed reasoning systems rather than relying on one-off prompts.

Link: https://sofai-lm-aaailab.github.io/
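To make the loop described in the abstract concrete, the following is a minimal sketch of the controller for the graph coloring domain. It is not the lab's actual code. The names `correctness`, `conflict_feedback`, `sofai_loop`, `toy_fast`, and `greedy_slow` are hypothetical, and the two solvers are plain Python stand-ins for the model calls that the lab serves through Ollama. The sketch passes the best attempt to the slower solver, which is one of the three context options mentioned above.

```python
from typing import Callable, Dict, List, Optional, Tuple

Edge = Tuple[int, int]
Coloring = Dict[int, int]
FastSolver = Callable[[List[Edge], Optional[str]], Coloring]
SlowSolver = Callable[[List[Edge], Coloring], Coloring]


def correctness(edges: List[Edge], coloring: Coloring) -> float:
    """Fraction of edges whose endpoints are both colored and differ."""
    if not edges:
        return 1.0
    ok = sum(
        1 for u, v in edges
        if u in coloring and v in coloring and coloring[u] != coloring[v]
    )
    return ok / len(edges)


def conflict_feedback(edges: List[Edge], coloring: Coloring) -> str:
    """Concise feedback that names each violated edge."""
    problems = []
    for u, v in edges:
        if u not in coloring or v not in coloring:
            problems.append(f"edge ({u}, {v}) has an uncolored endpoint")
        elif coloring[u] == coloring[v]:
            problems.append(
                f"nodes {u} and {v} are adjacent but both have color {coloring[u]}"
            )
    return "; ".join(problems)


def sofai_loop(edges: List[Edge], fast: FastSolver, slow: SlowSolver,
               max_iters: int = 3) -> Tuple[Coloring, str, int]:
    """Evaluate, give feedback, refine; fall back once the budget is spent."""
    feedback: Optional[str] = None
    best: Coloring = {}
    best_score = -1.0
    for i in range(1, max_iters + 1):
        candidate = fast(edges, feedback)
        score = correctness(edges, candidate)
        if score > best_score:
            best, best_score = candidate, score
        if score == 1.0:
            return candidate, "fast", i
        feedback = conflict_feedback(edges, candidate)
    # Fallback: hand the best attempt so far to the slower solver.
    return slow(edges, best), "slow", max_iters


def toy_fast(edges: List[Edge], feedback: Optional[str]) -> Coloring:
    # Hypothetical stand-in for a fast LLM call on a triangle graph:
    # wrong on the first try, correct once it has seen feedback.
    return {0: 0, 1: 0, 2: 1} if feedback is None else {0: 0, 1: 1, 2: 2}


def greedy_slow(edges: List[Edge], best_attempt: Coloring) -> Coloring:
    # Hypothetical stand-in for the slower reasoning model: greedy coloring.
    # It ignores best_attempt and any color budget, and assumes no self-loops.
    nodes = sorted({n for e in edges for n in e})
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    coloring: Coloring = {}
    for n in nodes:
        used = {coloring[m] for m in adj[n] if m in coloring}
        coloring[n] = next(c for c in range(len(nodes)) if c not in used)
    return coloring


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(sofai_loop(triangle, toy_fast, greedy_slow))
```

In this sketch the iteration budget `max_iters` is the main knob on the accuracy–time trade-off: a larger budget gives the fast solver more chances to repair its answer from feedback, while a smaller one reaches the slower solver sooner. Replacing the two stand-in solvers with real model calls, and logging the score and elapsed time per iteration, would be the natural next step.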

Conference paper