ST-Raptor: LLM-Powered Semi-Structured Table Question Answering

Published 25 Aug 2025 in cs.AI, cs.DB, and cs.IR | (2508.18190v3)

Abstract: Semi-structured tables, widely used in real-world applications (e.g., financial reports, medical records, transactional orders), often involve flexible and complex layouts (e.g., hierarchical headers and merged cells). These tables generally rely on human analysts to interpret table layouts and answer relevant natural language questions, which is costly and inefficient. To automate the procedure, existing methods face significant challenges. First, methods like NL2SQL require converting semi-structured tables into structured ones, which often causes substantial information loss. Second, methods like NL2Code and multi-modal LLM QA struggle to understand the complex layouts of semi-structured tables and cannot accurately answer corresponding questions. To this end, we propose ST-Raptor, a tree-based framework for semi-structured table question answering using LLMs. First, we introduce the Hierarchical Orthogonal Tree (HO-Tree), a structural model that captures complex semi-structured table layouts, along with an effective algorithm for constructing the tree. Second, we define a set of basic tree operations to guide LLMs in executing common QA tasks. Given a user question, ST-Raptor decomposes it into simpler sub-questions, generates corresponding tree operation pipelines, and conducts operation-table alignment for accurate pipeline execution. Third, we incorporate a two-stage verification mechanism: forward validation checks the correctness of execution steps, while backward validation evaluates answer reliability by reconstructing queries from predicted answers. To benchmark the performance, we present SSTQA, a dataset of 764 questions over 102 real-world semi-structured tables. Experiments show that ST-Raptor outperforms nine baselines by up to 20% in answer accuracy. The code is available at https://github.com/weAIDB/ST-Raptor.


Authors (9)

Summary

The paper introduces the HO-Tree, a novel hierarchical structure that effectively represents semi-structured table layouts for QA tasks.
It presents a pipeline that decomposes complex questions into manageable sub-questions using LLM-guided tree operations.
Experiments on the SSTQA dataset (764 questions over 102 tables) show up to a 20% improvement in answer accuracy over nine baselines.


The paper "ST-Raptor: LLM-Powered Semi-Structured Table Question Answering" addresses question answering (QA) over semi-structured tables. These tables, common in real-world applications such as financial reports and medical records, are hard to process automatically because of their flexible, complex layouts (e.g., hierarchical headers and merged cells). The paper introduces the ST-Raptor framework, which uses a tree-based structural model called the Hierarchical Orthogonal Tree (HO-Tree) to represent and interpret these layouts.

Overview of ST-Raptor Framework

The ST-Raptor framework aims to automate the interpretation of semi-structured tables and answer natural language questions efficiently. The key innovation is the HO-Tree representation, which captures headers, content values, and their implicit relationships within the table. This allows for precise manipulation and querying of the table data.

Figure 1: The ST-Raptor Architecture.

Key components of the ST-Raptor framework include:

HO-Tree: A multi-level tree structure that represents table layouts, modeling both hierarchical and orthogonal relationships.
Basic Tree Operations: A set of operations over the HO-Tree that guide LLMs in executing common QA tasks.
Pipeline Generation: The framework decomposes a complex user question into simpler sub-questions and generates a tree operation pipeline for each one.
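To make the pipeline idea concrete, here is a minimal Python sketch of running a sequence of operations in order, with each step consuming the previous step's output. The nested-dict table, the operation names (select, values, sum), and the run_pipeline helper are all hypothetical illustrations; they are not the paper's operation set or API.

```python
from typing import Any, Callable, Dict, List, Tuple

# A step is (operation name, positional arguments). Hypothetical format.
Step = Tuple[str, Tuple[Any, ...]]

# Hypothetical operations, for illustration only.
OPERATIONS: Dict[str, Callable[..., Any]] = {
    "select": lambda data, key: data[key],       # descend into one branch
    "values": lambda data: list(data.values()),  # collect the leaf values
    "sum": lambda data: sum(data),               # aggregate numeric values
}

def run_pipeline(table: Any, steps: List[Step]) -> Any:
    """Apply each step to the result of the previous one."""
    result = table
    for name, args in steps:
        result = OPERATIONS[name](result, *args)
    return result

# Toy table: a header hierarchy stored as nested dicts (an assumption).
table = {"Revenue": {"Q1": 10, "Q2": 12}, "Cost": {"Q1": 7, "Q2": 8}}

# Sub-question "What is the total revenue?" expressed as three steps.
steps = [("select", ("Revenue",)), ("values", ()), ("sum", ())]
total_revenue = run_pipeline(table, steps)
```

Keeping each step as plain data (a name plus arguments) means a generated pipeline can be inspected or checked before it is executed.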

Implementation and Application

HO-Tree Construction

The construction of the HO-Tree is crucial for representing semi-structured tables. It involves:

Meta Information Detection: Using vision-language models (VLMs) to identify headers and other semantic markers within a table.
Table Partitioning Principles: Utilizing structural principles to segment tables into hierarchical and orthogonal components.
Depth-First Search (DFS) Algorithm: This algorithm builds the HO-Tree by recursively traversing the table's layout, ensuring comprehensive structural representation.
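The recursive, depth-first flavor of the construction can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the table layout is taken to be already parsed into nested dicts (header to sub-headers to values), and the Node class, build_tree, and paths are hypothetical names. The sketch covers hierarchical nesting only and does not reproduce the paper's partitioning principles or its handling of orthogonal relationships.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

@dataclass
class Node:
    """One tree node: a header or a content value (hypothetical class)."""
    label: str
    children: List["Node"] = field(default_factory=list)

def build_tree(label: Any, layout: Any) -> Node:
    """Depth-first construction: fully build each branch before the next."""
    node = Node(str(label))
    if isinstance(layout, dict):
        # Each key is a sub-header; recurse into its sub-layout.
        for key, sub_layout in layout.items():
            node.children.append(build_tree(key, sub_layout))
    else:
        # Anything else is treated as a single content value (leaf).
        node.children.append(Node(str(layout)))
    return node

def paths(node: Node, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """List every root-to-leaf path, again by depth-first traversal."""
    current = prefix + (node.label,)
    if not node.children:
        return [current]
    found: List[Tuple[str, ...]] = []
    for child in node.children:
        found.extend(paths(child, current))
    return found

layout = {"Revenue": {"Q1": 10, "Q2": 12}, "Cost": {"Q1": 7}}
tree = build_tree("Report", layout)
all_paths = paths(tree)
```

Each root-to-leaf path pairs a value with the full chain of headers above it, which is the information a flattened relational table tends to lose.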

Operation Set for QA Tasks

A set of atomic operations is defined for querying and manipulating the HO-Tree:

Data Retrieval (e.g., CHL, FAT): Fetch nodes based on tree hierarchies.
Data Manipulation (e.g., Cond, Math): Apply predicates or perform calculations on retrieved data.
Align and Reason Operations: Ensure alignment between question semantics and tree content, and leverage LLM reasoning to derive answers.
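The retrieval and manipulation operations above can be illustrated with a small Python sketch. The names CHL, FAT, Cond, and Math come from the summary; reading CHL as "children" and FAT as "father (parent)" is an assumption, as are the function signatures and the Node class. The LLM-based Align and Reason operations are not modeled here.

```python
from typing import Callable, Iterable, List, Optional

class Node:
    """Tree node with a parent pointer (hypothetical class)."""
    def __init__(self, label: str, parent: Optional["Node"] = None):
        self.label = label
        self.parent = parent
        self.children: List["Node"] = []
        if parent is not None:
            parent.children.append(self)

def chl(node: Node) -> List[Node]:
    """Assumed meaning of CHL: return the node's children."""
    return list(node.children)

def fat(node: Node) -> Optional[Node]:
    """Assumed meaning of FAT: return the node's parent."""
    return node.parent

def cond(nodes: Iterable[Node], predicate: Callable[[Node], bool]) -> List[Node]:
    """Cond: keep only the nodes that satisfy a predicate."""
    return [n for n in nodes if predicate(n)]

def math_op(nodes: Iterable[Node], fn: Callable[[List[float]], float]) -> float:
    """Math: apply a numeric function to the nodes' values."""
    return fn([float(n.label) for n in nodes])

# Toy tree: Sales -> Q1 -> 10, Sales -> Q2 -> 12.5
root = Node("Sales")
q1 = Node("Q1", root)
q2 = Node("Q2", root)
Node("10", q1)
Node("12.5", q2)

values = [leaf for quarter in chl(root) for leaf in chl(quarter)]
total = math_op(values, sum)
large = cond(values, lambda n: float(n.label) > 11)
large_quarters = [fat(n).label for n in large]
```

Here retrieval (chl, fat) only walks the tree, while cond and math_op work on whatever nodes were retrieved, so the two kinds of operation compose freely.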

Question Decomposition

ST-Raptor decomposes complex multi-hop questions into simpler sub-questions. Two techniques support this step:

Semantic Alignment: Ensures operations align with the table's meta-information.
Column-Type Aware Tagging: Enhances data retrieval accuracy by categorizing columns based on data characteristics.
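As a rough illustration of these two ideas, the Python sketch below tags a column by inspecting its values and aligns a question term to the closest table header. Both parts are stand-ins under stated assumptions: the two-way numeric/text tagging and the use of string similarity from the standard library's difflib are simplifications chosen here, not the matching method described in the paper, and tag_column and align are hypothetical names.

```python
import difflib
from typing import List, Optional

def _is_number(text: str) -> bool:
    """True if the text parses as a number once thousands commas are removed."""
    try:
        float(text.replace(",", ""))
        return True
    except ValueError:
        return False

def tag_column(values: List[str]) -> str:
    """Tag a column 'numeric' if every non-empty cell is a number, else 'text'."""
    non_empty = [v.strip() for v in values if v.strip()]
    if non_empty and all(_is_number(v) for v in non_empty):
        return "numeric"
    return "text"

def align(term: str, headers: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Map a question term to the most similar header, or None if none is close."""
    by_lower = {h.lower(): h for h in headers}
    matches = difflib.get_close_matches(
        term.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None

headers = ["Total Revenue", "Unit Cost"]
revenue_tag = tag_column(["1,200", "35.5", ""])
name_tag = tag_column(["Alice", "42"])
matched = align("total revenu", headers)
unmatched = align("zzz", headers)
```

Returning None when nothing is close enough lets a caller fall back to another strategy instead of silently operating on the wrong column.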

Experimental Evaluation

The paper evaluates ST-Raptor on the SSTQA dataset, comprising 764 questions over 102 semi-structured tables. It significantly outperforms nine baseline models, demonstrating up to 20% higher answer accuracy.

Figure 2: Error Distribution -- Analyzing model performance across different table formats and processing methods.

Key findings show that ST-Raptor's hybrid retrieval strategy (top-down and bottom-up) and its two-stage verification mechanism bolster its robustness and accuracy in QA tasks.
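The two-stage verification idea can be sketched as follows. The abstract describes forward validation as checking execution steps and backward validation as reconstructing a query from the predicted answer; everything else in this Python sketch is an assumption. In particular, the non-empty-output check, the token-overlap similarity, the 0.5 threshold, and the reconstruct callback (which would be an LLM call in practice) are hypothetical stand-ins.

```python
from typing import Any, Callable, List

def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens (stand-in metric)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def forward_ok(step_outputs: List[Any]) -> bool:
    """Forward check (assumed form): every step produced a non-empty result."""
    return all(out is not None and out != [] for out in step_outputs)

def backward_ok(
    question: str,
    answer: str,
    reconstruct: Callable[[str], str],
    threshold: float = 0.5,
) -> bool:
    """Backward check: rebuild a question from the answer and compare it."""
    rebuilt = reconstruct(answer)
    return token_overlap(question, rebuilt) >= threshold

question = "What is the total revenue in 2023"

# Stub reconstructors standing in for an LLM call.
consistent = lambda answer: "what is the total revenue in 2023"
inconsistent = lambda answer: "who approved the order"

accepted = forward_ok([[10, 12], 22]) and backward_ok(question, "22", consistent)
rejected = backward_ok(question, "22", inconsistent)
```

The two checks catch different failures: the forward check flags a pipeline that broke partway through, while the backward check flags an answer that was computed cleanly but responds to a different question.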

Conclusion

ST-Raptor addresses the challenges of semi-structured table QA by combining the HO-Tree representation with LLM-guided tree operations. Its question decomposition and two-stage answer verification contribute to accuracy gains of up to 20% over nine baselines on SSTQA. Future work may improve the framework's scalability and explore architectures for more complex data scenarios.
