Demo paper

ToolSmith: A Multi-Agent Framework for Enterprise Tool Creation

Abstract

Although LLMs can generate tools for generic domains and tasks, they struggle with enterprise-related domains that involve proprietary APIs and data schemas. We present ToolSmith, a framework for autonomously generating and validating agent-compatible tools. Given an API specification and a Tool Specification Requirement (TSR), ToolSmith produces a tool function and verifies it through a closed-loop process: it creates natural language (NL) tests and executes the tool in a secure agent sandbox for validation. For state-changing tools, ToolSmith confirms outcomes by querying the API with parameters derived from the NL tests. If the tool fails to produce the desired output, ToolSmith generates diagnostic feedback to iteratively regenerate it. By ensuring both functional correctness and agent compatibility, ToolSmith enables reliable automation of enterprise workflows.
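
The closed-loop process described above can be illustrated with a minimal sketch. It is not ToolSmith's implementation: the names closed_loop, generate, make_tests, run_in_sandbox, and LoopResult, as well as the max_attempts cap, are hypothetical and introduced only for illustration. The LLM calls and the agent sandbox are passed in as plain callables so the control flow can be read on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class LoopResult:
    tool_source: Optional[str]  # candidate that passed every test, or None
    attempts: int               # number of generation attempts made
    feedback_log: List[str]     # diagnostic feedback produced after each failure


def closed_loop(
    generate: Callable[[str, str, Optional[str]], str],
    make_tests: Callable[[str], List[str]],
    run_in_sandbox: Callable[[str, str], Tuple[bool, str]],
    api_spec: str,
    tsr: str,
    max_attempts: int = 3,  # assumed cap; the abstract does not state one
) -> LoopResult:
    """Generate a tool, validate it against NL tests, and regenerate on failure."""
    # Hypothetical: NL tests are derived once from the TSR.
    tests = make_tests(tsr)
    feedback: Optional[str] = None
    log: List[str] = []

    for attempt in range(1, max_attempts + 1):
        # The previous round's feedback (None on the first round) conditions generation.
        candidate = generate(api_spec, tsr, feedback)

        failures = []
        for test in tests:
            # Assumption: for state-changing tools, run_in_sandbox would also
            # query the API to confirm the outcome before reporting success.
            ok, detail = run_in_sandbox(candidate, test)
            if not ok:
                failures.append(f"{test}: {detail}")

        if not failures:
            return LoopResult(candidate, attempt, log)

        # Collapse the failures into diagnostic feedback for the next attempt.
        feedback = "; ".join(failures)
        log.append(feedback)

    return LoopResult(None, max_attempts, log)
```

In this sketch a candidate is accepted only when every NL test passes, and the loop gives up after max_attempts rounds, returning the accumulated feedback so a caller can inspect why generation failed.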