AGENTIQL: Multi-Agent Text-to-SQL Framework
- AGENTIQL is a modular framework for text-to-SQL conversion, decomposing tasks into dedicated reasoning, coding, and refinement agents.
- Its adaptive routing selects between a multi-agent pipeline and a baseline parser to optimize accuracy and efficiency based on query complexity.
- Parallel execution and clear intermediate outputs ensure interpretable, scalable performance that approaches GPT-4 SOTA using smaller open-source models.
AGENTIQL is an agent-inspired multi-expert framework for text-to-SQL generation designed to address limitations in monolithic LLM architectures, particularly for complex queries and diverse database schemas. By modularizing the semantic parsing process into specialized agents—reasoning, coding, and refinement—AGENTIQL supports interpretable, parallelizable, and scalable SQL query synthesis. The framework integrates an adaptive router that dynamically balances accuracy and efficiency, allowing selective invocation of the modular pipeline or a baseline parser. AGENTIQL advances execution accuracy and transparency, approaching GPT-4-level state-of-the-art (SOTA) performance on Spider with substantially smaller open-source LLMs.
1. Agent-Inspired Modular Architecture
AGENTIQL’s architecture decomposes the text-to-SQL task into clearly defined agent roles:
- Reasoning Agent: Given the input (natural language query $q$ and database schema $s$), the reasoning agent prunes irrelevant tables, $s' = f_{\text{filter}}(q, s)$, and then decomposes the query into sub-questions, $\{\hat{q}_1, \dots, \hat{q}_k\} = f_{\text{decomp}}(q, s')$.
- Coding Agent: For each sub-question $\hat{q}_j$, the coding agent generates candidate SQL queries, $y_j^{(0)} = f_{\text{code}}(\hat{q}_j, s')$. An iterative refinement loop modifies faulty sub-query outputs, $y_j^{(t+1)} = f_{\text{code}}\big(\hat{q}_j, s', y_j^{(t)}, e_j^{(t)}\big)$, where $e_j^{(t)}$ is execution feedback on the $t$-th attempt.
- Refinement Agent (Column Selection): After merging the sub-queries into $\tilde{y}$ using $f_{\text{merge}}$, the column selection agent aligns the output with the question via $y^{*} = f_{\text{cs}}(q, s', \tilde{y})$.
This explicit decomposition enables interpretability, modularity, and targeted debugging, facilitating fine-grained error isolation and improved query synthesis in complex scenarios.
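To make the flow concrete, here is a minimal Python sketch of such a three-stage pipeline. The agent callables (`prune_tables`, `decompose`, `generate_sql`, `check_sql`, `refine_sql`, `merge_sql`, `select_columns`) are hypothetical LLM-backed functions introduced for illustration; this is a sketch of the described control flow, not AGENTIQL's actual API.

```python
# Sketch of an AGENTIQL-style agent pipeline. All callables are hypothetical
# stand-ins for LLM-backed prompts; not the authors' implementation.
from typing import Callable, List, Tuple

def run_pipeline(
    question: str,
    schema: str,
    prune_tables: Callable[[str, str], str],
    decompose: Callable[[str, str], List[str]],
    generate_sql: Callable[[str, str], str],
    check_sql: Callable[[str, str], Tuple[bool, str]],   # returns (ok, feedback)
    refine_sql: Callable[[str, str, str, str], str],
    merge_sql: Callable[[str, List[str]], str],
    select_columns: Callable[[str, str, str], str],
    max_refine_steps: int = 3,
) -> str:
    # Reasoning agent: prune irrelevant tables, then split the question.
    pruned = prune_tables(question, schema)
    sub_questions = decompose(question, pruned)

    # Coding agent: one SQL fragment per sub-question, refined on failure.
    sub_queries: List[str] = []
    for sub_q in sub_questions:
        sql = generate_sql(sub_q, pruned)
        for _ in range(max_refine_steps):
            ok, feedback = check_sql(sql, pruned)
            if ok:
                break
            sql = refine_sql(sub_q, pruned, sql, feedback)
        sub_queries.append(sql)

    # Refinement agent: merge sub-queries, then align the SELECT clause.
    merged = merge_sql(question, sub_queries)
    return select_columns(question, pruned, merged)
```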
2. Adaptive Routing Mechanism
AGENTIQL employs an adaptive router to select between its modular agentic pipeline and a baseline direct parser. Selection is performed using explicit metrics such as schema complexity (e.g., number of tables mentioned) and may be augmented via learned decision functions (e.g., XGBoost classifiers trained on query features). The router logic ensures:
- Complex queries: Routed to the modular pipeline for detailed reasoning, decomposition, and schema analysis.
- Simple queries: Handled by the baseline parser for lower-latency, direct mapping.
This approach balances resource utilization and execution accuracy, expending computational effort only where it yields quantifiable gains and thereby supporting enterprise-scale deployments.
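A minimal sketch of such a router is shown below. It assumes a simple table-mention heuristic with an optional hook for a learned classifier (e.g., an XGBoost model exposed as a callable); the threshold and feature set are illustrative assumptions, not the paper's configuration.

```python
# Sketch of the adaptive router: route by a schema-complexity heuristic
# (number of schema tables referenced in the question), optionally overridden
# by a learned classifier. Threshold and features are illustrative assumptions.
from typing import Callable, List, Optional

def route(
    question: str,
    schema_tables: List[str],
    table_threshold: int = 2,
    learned_classifier: Optional[Callable[[List[float]], bool]] = None,
) -> str:
    """Return 'modular' for the multi-agent pipeline or 'baseline' for the direct parser."""
    q = question.lower()
    mentioned = sum(1 for t in schema_tables if t.lower() in q)

    if learned_classifier is not None:
        # e.g. a classifier trained on query features; True means "complex query".
        features = [float(mentioned), float(len(schema_tables)), float(len(q.split()))]
        return "modular" if learned_classifier(features) else "baseline"

    # Heuristic fallback: queries touching several tables go to the modular pipeline.
    return "modular" if mentioned >= table_threshold else "baseline"
```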
3. Parallel Execution and Scalability
Several stages in AGENTIQL’s pipeline are intentionally designed to be parallelizable:
- Sub-question SQL generation can be performed concurrently for all sub-questions $\hat{q}_j$.
- Table filtering, coding, and aspects of the merge/refine phase do not require strict sequentiality.
- Column selection can also be distributed when multiple candidate queries are generated.
By exploiting hardware parallelism (such as multi-GPU clusters), AGENTIQL achieves reduced latency and increased throughput, accommodating large-scale workloads typical of multi-database or federated query scenarios.
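As an illustration, concurrent sub-question SQL generation can be sketched with a thread pool, assuming a hypothetical `generate_sql` callable that wraps an I/O-bound LLM call; this is a sketch of the parallelization pattern, not the framework's own execution engine.

```python
# Sketch of concurrent sub-question SQL generation using a thread pool.
# LLM calls are typically I/O-bound, so threads (or an async client) suffice
# to overlap them; results are collected in submission order.
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

def generate_sub_queries(
    sub_questions: List[str],
    pruned_schema: str,
    generate_sql: Callable[[str, str], str],  # hypothetical LLM-backed callable
    max_workers: int = 8,
) -> List[str]:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(generate_sql, q, pruned_schema) for q in sub_questions]
        # Preserve sub-question order so the merge step sees aligned results.
        return [f.result() for f in futures]
```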
4. Performance and Evaluation
AGENTIQL demonstrates strong empirical results on the Spider benchmark:
| Model Size | Merging Strategy + CS | Execution Accuracy (EX) | Comparator (GPT-4 SOTA EX) |
|---|---|---|---|
| 14B | Planner&Executor + CS | 86.07% | 89.65% |
Key findings:
- The modular pipeline yields up to 9% EX improvement over one-step baselines on high-complexity queries.
- CS (column selection) refinement delivers an additional 2–5% EX gain.
- AGENTIQL narrows the performance gap to GPT-4-based SOTA while using models orders of magnitude smaller (14B open-source LLMs).
The router's efficacy is critical: misrouted queries incur execution-accuracy drops, underscoring the need for reliable complexity detection at the routing stage.
5. Transparency and Interpretability
AGENTIQL prioritizes transparency by exposing all intermediate reasoning steps:
- Query decomposition is explicitly traceable.
- Sub-query SQL candidates are available for inspection, facilitating semantic error identification.
- Separate refinement for column selection provides clear audit trails of SELECT clause alignment.
This modularity enables stakeholders to diagnose and correct errors at each stage, offers a foundation for explainable query synthesis, and inherently supports debugging and extension for new database domains.
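One way to surface these intermediate artifacts is a per-run trace record, as in the minimal sketch below; the field names are illustrative assumptions, not AGENTIQL's actual output schema.

```python
# Sketch of a trace record that exposes each intermediate step for inspection.
# Field names are illustrative, not the framework's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PipelineTrace:
    question: str
    pruned_tables: List[str] = field(default_factory=list)   # reasoning agent output
    sub_questions: List[str] = field(default_factory=list)   # decomposition
    sub_queries: List[str] = field(default_factory=list)     # coding agent candidates
    merged_sql: str = ""                                       # merge result
    final_sql: str = ""                                        # after column selection

    def report(self) -> str:
        # Render an auditable, step-by-step view of the run.
        lines = [f"Question: {self.question}",
                 f"Kept tables: {', '.join(self.pruned_tables)}"]
        for i, (q, s) in enumerate(zip(self.sub_questions, self.sub_queries), 1):
            lines.append(f"  [{i}] {q} -> {s}")
        lines += [f"Merged SQL: {self.merged_sql}", f"Final SQL: {self.final_sql}"]
        return "\n".join(lines)
```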
6. Mathematical Formalizations
Key formulations for each pipeline stage, stated here in a consistent notation:
- Dataset Construction: $\mathcal{D} = \{(q_i, s_i, y_i)\}_{i=1}^{N}$, pairing NL queries $q_i$ and schemas $s_i$ with target SQL $y_i$.
- Table Filtering: $s' = f_{\text{filter}}(q, s)$.
- Decomposition: $\{\hat{q}_1, \dots, \hat{q}_k\} = f_{\text{decomp}}(q, s')$.
- Iterative SQL Generation: $y_j^{(t+1)} = f_{\text{code}}\big(\hat{q}_j, s', y_j^{(t)}, e_j^{(t)}\big)$, with execution feedback $e_j^{(t)}$.
- Merging: $\tilde{y} = f_{\text{merge}}(q, y_1, \dots, y_k)$.
- Column Selection: $y^{*} = f_{\text{cs}}(q, s', \tilde{y})$.
These expressions enable precise discussion and implementation of each pipeline segment.
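Read end to end, the stages compose into a single mapping from $(q, s)$ to the final query $y^{*}$; the display below restates that composition in the same notation as the list above.

```latex
% End-to-end composition of the pipeline stages (same notation as above).
\begin{align*}
  s' &= f_{\text{filter}}(q, s), \qquad
  \{\hat{q}_1, \dots, \hat{q}_k\} = f_{\text{decomp}}(q, s'), \\
  y_j^{(t+1)} &= f_{\text{code}}\big(\hat{q}_j, s', y_j^{(t)}, e_j^{(t)}\big)
  \quad \text{until } e_j^{(t)} \text{ reports success}, \\
  y^{*} &= f_{\text{cs}}\big(q, s', \, f_{\text{merge}}(q, y_1, \dots, y_k)\big).
\end{align*}
```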
7. Summary and Implications
AGENTIQL’s agent-inspired multi-expert design achieves scalable, interpretable, and efficient text-to-SQL generation. By modularizing complex reasoning and synthesis, employing adaptive routing, and enabling parallel execution, the framework matches state-of-the-art accuracy with substantially reduced model size and improved transparency. This architecture is well-suited for both research and applied enterprise semantic parsing, and its separation of reasoning, generation, and refinement provides a template for extensible pipeline design in other domains where interpretable agentic workflows are desired.