Structured Plan Representation
- Structured plan representation is a formal method that encodes tasks and queries as graphs or trees, capturing actions, dependencies, and semantic details for systematic reasoning.
- Methodologies like graph neural network encodings, LLM-based graph structuring, and logic-based pipelines enable efficient construction, verification, and execution in diverse domains.
- Empirical studies show that these representations improve cost estimation, plan validation, and bug detection in DBMS and robotics, guiding future integrations and optimizations.
Structured plan representation refers to the formalization of plans—sequences or graphs of actions, decisions, or operations—to enable systematic reasoning, efficient learning, and robust execution, particularly in domains spanning query optimization, procedure automation, AI planning, and robotics. Such representations encode structural, semantic, and dependency information, supporting both human understanding and computational manipulation. Recent work has advanced the field with techniques ranging from directed acyclic graphs (DAGs) for procedural workflows to bidirectional neural encodings of database plans.
1. Formal Models and Structural Principles
Structured plan representations distill the essential elements of complex tasks into precisely defined mathematical or logical structures. A fundamental example is the modeling of procedures or queries as rooted trees or DAGs, where nodes represent actions or operations and edges encode sequencing, data dependencies, or hierarchical relationships. In standardized procedure automation, SOPStruct models a procedure as a DAG $G = (V, E)$, where $V$ is the set of subtasks annotated with name, description, dependencies, inputs, outputs, and semantic category, while $E$ encodes ordered dependencies (temporal and data-flow) (Garg et al., 28 Mar 2025).
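A minimal sketch of such a procedure graph in Python may help make the structure concrete. The class name `Subtask`, its field names, the helper `build_edges`, and the three-step example procedure are all illustrative assumptions, not SOPStruct's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    """One node of the procedure graph; field names are illustrative."""
    name: str
    description: str = ""
    category: str = ""
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)  # prerequisite subtask names


def build_edges(subtasks: list[Subtask]) -> set[tuple[str, str]]:
    """Derive (prerequisite, dependent) pairs from the per-node dependency lists."""
    known = {s.name for s in subtasks}
    edges = set()
    for s in subtasks:
        for dep in s.dependencies:
            if dep not in known:
                raise ValueError(f"{s.name} depends on unknown subtask {dep}")
            edges.add((dep, s.name))
    return edges


# Hypothetical three-step procedure.
sop = [
    Subtask("collect_sample", outputs=["sample"]),
    Subtask("label_sample", inputs=["sample"], outputs=["labeled_sample"],
            dependencies=["collect_sample"]),
    Subtask("store_sample", inputs=["labeled_sample"],
            dependencies=["label_sample"]),
]
edges = build_edges(sop)
```

Storing dependencies on each node and deriving the edge set from them keeps the node annotations and the graph structure from drifting apart.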
In the context of database management systems (DBMS), query execution plans are tree-structured, with nodes capturing operator type (e.g., scan, join), accessed tables, predicates, and estimated cardinalities. These plans must be encoded into fixed-size representations for use with downstream machine learning models for tasks such as cost estimation and plan selection (Chang et al., 2024, Ba et al., 2024).
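To illustrate what "fixed-size" means here, the following sketch folds a plan tree of any shape into a vector of constant length. It is a hand-written bag-of-operators baseline, not one of the learned encoders discussed in the cited work; the operator vocabulary, the `PlanNode` fields, and the example plan are assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

OPERATORS = ["SeqScan", "IndexScan", "HashJoin", "NestedLoop"]  # assumed vocabulary


@dataclass
class PlanNode:
    op: str
    est_rows: float
    table: Optional[str] = None
    children: list["PlanNode"] = field(default_factory=list)


def encode(node: PlanNode) -> list[float]:
    """Post-order fold: per-operator counts followed by summed log-cardinality."""
    vec = [0.0] * (len(OPERATORS) + 1)
    for child in node.children:
        vec = [a + b for a, b in zip(vec, encode(child))]
    vec[OPERATORS.index(node.op)] += 1.0
    vec[-1] += math.log1p(node.est_rows)
    return vec


# Hypothetical two-table join plan.
plan = PlanNode("HashJoin", 100, children=[
    PlanNode("SeqScan", 1000, table="orders"),
    PlanNode("IndexScan", 10, table="customers"),
])
vec = encode(plan)  # length is len(OPERATORS) + 1 regardless of tree shape
```

This baseline discards tree shape entirely, which is the information that tree- and graph-based encoders are designed to retain.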
Human cognitive planning is described using the framework of Markov Decision Processes (MDP), where representations are generated by abstraction—pruning states, actions, or transitions—leading to smaller induced MDPs that trade off between representational complexity and expected planning utility (Ho et al., 2021).
2. Methodologies for Structured Representation Construction
Multiple technical approaches exist for constructing structured plan representations, reflecting both the domain and intended computational goals.
Graph Neural Network Encodings
The BiGG model, developed for ML-based query optimization, encodes plan trees as directed graphs with bidirectional edges. Node features aggregate operator, table, and predicate embeddings. Message passing is implemented via parallel TransformerConv layers for both child→parent and parent→child directions, with attention-based aggregation (Chang et al., 2024). The outputs are fused with a learnable gating mechanism and aggregated in post-order via a GRU, yielding a robust fixed-size vector encoding.
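The two-direction idea can be sketched in plain Python. This is a deliberately simplified stand-in, not the BiGG implementation: mean aggregation replaces the attention-based TransformerConv layers, a fixed scalar replaces the learned gate, and the GRU readout is omitted. All function names and the toy features are assumptions.

```python
def directional_edges(parent_of):
    """Split a plan tree into child->parent and parent->child edge lists."""
    up = [(child, parent) for child, parent in parent_of.items()]
    down = [(parent, child) for child, parent in parent_of.items()]
    return up, down


def mean_aggregate(features, edges):
    """Each node receives the mean of its in-neighbours' feature vectors."""
    dim = len(next(iter(features.values())))
    sums = {n: [0.0] * dim for n in features}
    counts = {n: 0 for n in features}
    for src, dst in edges:
        sums[dst] = [a + b for a, b in zip(sums[dst], features[src])]
        counts[dst] += 1
    return {n: [x / counts[n] for x in sums[n]] if counts[n] else sums[n]
            for n in features}


def gated_fuse(up_msg, down_msg, gate=0.5):
    """Convex combination of both directions; a real model would learn the gate."""
    return {n: [gate * u + (1 - gate) * d for u, d in zip(up_msg[n], down_msg[n])]
            for n in up_msg}


# Node 0 is a join whose children are the scans 1 and 2.
parent_of = {1: 0, 2: 0}
features = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [0.0, 4.0]}
up, down = directional_edges(parent_of)
fused = gated_fuse(mean_aggregate(features, up), mean_aggregate(features, down))
```

In the fused result the join node carries information from its children while each scan carries information from its parent, which a purely bottom-up tree encoder would not provide.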
LLM-based Graph Structuring of Procedures
In SOPStruct, LLMs are prompted to segment unstructured SOPs, extract node attributes under a schema, and build an explicit DAG with annotated dependencies. The process combines deterministic semantic segmentation, schema-constrained subtask extraction, and explicit dependency enumeration, followed by global cycle detection (Garg et al., 28 Mar 2025).
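The final cycle check can be done with any standard algorithm. The source does not say which one SOPStruct uses, so the sketch below assumes Kahn's algorithm, which also yields an execution order as a by-product. The function name is illustrative.

```python
from collections import deque


def topological_order(nodes, edges):
    """Kahn's algorithm: return an ordering, or raise if the graph has a cycle."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # Nodes on a cycle never reach indegree zero, so they are missing from order.
    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order
```

Running the check once over the whole graph, rather than per extracted subtask, catches cycles that only appear when separately extracted dependencies are combined.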
Logic-based Translation and Inference
The TIC framework introduces a three-stage Translate-Infer-Compile pipeline, where LLMs generate an Answer Set Programming (ASP)-based intermediate representation. Logic inference completes missing information and expands intensional definitions, with a final compilation step yielding a ground PDDL plan (Agarwal et al., 2024). This modular process is designed to limit typical LLM hallucinations by separating semantic abstraction from full plan instantiation.
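The infer and compile stages can be illustrated with a toy example. In this sketch a few lines of Python stand in for the ASP solver, and the rule format, function names, and Blocksworld-style task are assumptions made for illustration; none of it reproduces the TIC implementation.

```python
def infer(objects, rules):
    """Expand intensional rules (predicate, type) into one ground atom per object."""
    atoms = []
    for predicate, type_name in rules:
        for obj in objects[type_name]:
            atoms.append(f"({predicate} {obj})")
    return atoms


def compile_problem(name, domain, objects, init, goal):
    """Assemble a ground PDDL problem string from fully expanded facts."""
    decls = " ".join(f"{' '.join(names)} - {t}" for t, names in objects.items())
    return (
        f"(define (problem {name}) (:domain {domain})\n"
        f"  (:objects {decls})\n"
        f"  (:init {' '.join(init)})\n"
        f"  (:goal (and {' '.join(goal)})))"
    )


# Intermediate representation: "every block is on the table and clear".
objects = {"block": ["a", "b", "c"]}
init = infer(objects, [("ontable", "block"), ("clear", "block")]) + ["(handempty)"]
pddl = compile_problem("stack", "blocksworld", objects, init, ["(on a b)"])
```

The upstream model only has to state the compact rule; enumerating the per-object facts is left to deterministic code, which is the separation the pipeline relies on.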
Hierarchical and Multi-layered Graphs in Navigation
For robotics and embodied AI, STRIVE models spatial environments as multi-layered graphs combining viewpoint nodes (with 3D position/coverage), object nodes (with masks, classes), and room nodes (with semantic and topological attributes). This hierarchy supports both global room-level reasoning and fine-grained intra-room action selection (Zhu et al., 10 May 2025).
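A layered graph of this kind could be laid out as nested records. The sketch below is an assumed data layout with hypothetical class and field names, not STRIVE's actual structures; it only shows how a room-level lookup narrows the search before viewpoints are examined.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectNode:
    label: str


@dataclass
class Viewpoint:
    position: tuple[float, float, float]
    objects: list[ObjectNode] = field(default_factory=list)


@dataclass
class Room:
    name: str
    neighbors: list[str] = field(default_factory=list)  # topological links
    viewpoints: list[Viewpoint] = field(default_factory=list)


def find_viewpoints(rooms, room_name, target_label):
    """Room-level lookup first, then intra-room search over viewpoints."""
    room = next(r for r in rooms if r.name == room_name)
    return [vp for vp in room.viewpoints
            if any(o.label == target_label for o in vp.objects)]


# Hypothetical two-room environment.
rooms = [
    Room("kitchen", neighbors=["hall"], viewpoints=[
        Viewpoint((1.0, 0.0, 2.0), [ObjectNode("sink")]),
        Viewpoint((3.0, 0.0, 2.0), [ObjectNode("mug"), ObjectNode("table")]),
    ]),
    Room("hall", neighbors=["kitchen"]),
]
hits = find_viewpoints(rooms, "kitchen", "mug")
```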
3. Encodings, Verification, and Soundness
Ensuring the soundness and utility of structured representations typically relies on formal verification methods and evaluation schemes.
Plan Formalism and Verification
SOPStruct converts its structured DAGs to PDDL, defining the execution semantics with custom predicates for variable availability and subtask dependencies. A classical planner checks executable soundness—no missing inputs, no cycles, and satisfaction of the output conditions—while an LLM scores semantic completeness against the original SOP's specification (Garg et al., 28 Mar 2025).
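The three executable-soundness conditions can also be checked directly on the graph. The sketch below is a Python stand-in for what the planner establishes, not the PDDL encoding itself; the dictionary layout and the example procedure are assumptions, and every dependency is assumed to name a known subtask.

```python
from graphlib import CycleError, TopologicalSorter


def check_executable(subtasks, initial_inputs, required_outputs):
    """Return a list of problems; an empty list means all three checks pass."""
    deps = {name: spec["deps"] for name, spec in subtasks.items()}
    try:
        order = list(TopologicalSorter(deps).static_order())
    except CycleError:
        return ["dependency cycle"]
    problems, reachable = [], {}
    for name in order:
        # A subtask may use initial inputs plus anything produced upstream of it.
        have = set(initial_inputs)
        for dep in subtasks[name]["deps"]:
            have |= reachable[dep]
        missing = set(subtasks[name]["inputs"]) - have
        if missing:
            problems.append(f"{name}: missing inputs {sorted(missing)}")
        reachable[name] = have | set(subtasks[name]["outputs"])
    produced = set().union(*reachable.values()) if reachable else set()
    unmet = set(required_outputs) - produced
    if unmet:
        problems.append(f"unmet outputs {sorted(unmet)}")
    return problems


# Hypothetical two-step procedure.
plan = {
    "weigh": {"deps": [], "inputs": ["reagent"], "outputs": ["weighed_reagent"]},
    "mix": {"deps": ["weigh"], "inputs": ["weighed_reagent", "solvent"],
            "outputs": ["solution"]},
}
ok = check_executable(plan, ["reagent", "solvent"], ["solution"])
bad = check_executable(plan, ["reagent"], ["solution"])
```

Restricting each subtask to outputs of its own ancestors, rather than to anything executed earlier, is what ties input availability to the declared dependencies.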
Logic Inference and Compilation
The TIC pipeline uses logic solvers not only to validate but also to expand the intermediate representation into a fully instantiated PDDL task. Deterministic rules enforce domain constraints (e.g., object cardinality, bijective relations) and enable legal plan compilation even in domains with extensive combinatorial structure (Agarwal et al., 2024).
Unified Query Plan Grammar
A unified query plan representation for DBMS testing and comparison adopts a strictly defined grammar (see Table 1), normalizing operator categories and node property types. This harmonization enables cross-system plan analysis, testing, and visualization (Ba et al., 2024).
| Component | Example Categories/Fields | Purpose |
|---|---|---|
| Operations | Producer, Join, Executor, Consumer | Structural roles in plan |
| Properties | Cardinality, Cost, Configuration, Status | Quantitative/meta annotations |
| Formats | JSON, XML, DAG, Text, Table | Serialization, visualization |
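As a rough illustration of the three components in the table, the sketch below wraps DBMS-specific operators in a normalized node and serializes the result to JSON. The field names, the `to_unified` helper, and the assignment of PostgreSQL operator names to categories are assumptions for illustration and do not reproduce the published grammar.

```python
import json

# Assumed mapping from PostgreSQL operator names to the table's categories.
CATEGORY = {"Seq Scan": "Producer", "Index Scan": "Producer", "Hash Join": "Join"}


def to_unified(op_name, properties, children=()):
    """Wrap a DBMS-specific operator in a normalized plan node."""
    return {
        "operation": {"category": CATEGORY[op_name], "name": op_name},
        "properties": properties,
        "children": list(children),
    }


# Hypothetical plan with made-up cardinality and cost values.
plan = to_unified(
    "Hash Join",
    {"cardinality": 100, "cost": 42.5},
    children=[
        to_unified("Seq Scan", {"cardinality": 1000, "cost": 18.0}),
        to_unified("Index Scan", {"cardinality": 10, "cost": 4.2}),
    ],
)
serialized = json.dumps(plan, indent=2)
```

Once every system's plans share this shape, a single traversal can compare operator categories or cardinalities across engines without per-DBMS parsing logic.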
4. Empirical Evaluations and Comparative Outcomes
Empirical studies have quantified the impact of structured plan representations on downstream performance across multiple domains.
ML-based Query Optimization
On TPC-DS cost estimation and plan selection tasks (up to 10-join queries), BiGG achieves a median Q-Error of 1.762 (vs. 1.840 for Tree-LSTM) and Spearman correlation of 0.805 (vs. 0.783), indicating ∼4.3% error reduction and improved rank fidelity. In plan selection, BiGG delivers the lowest plan suboptimality (median 1.045) (Chang et al., 2024).
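For readers unfamiliar with the metric, Q-Error is commonly defined as the larger of the two ratios between an estimate and the true value, so a perfect estimate scores 1. The sketch below assumes that standard definition; the estimates and actual values are made up for illustration.

```python
from statistics import median


def q_error(estimated, actual):
    """Symmetric ratio error; 1.0 means the estimate is exact."""
    return max(estimated / actual, actual / estimated)


# Hypothetical (estimated, actual) cost pairs.
pairs = [(100.0, 80.0), (50.0, 100.0), (8.0, 8.0)]
errors = [q_error(est, act) for est, act in pairs]
median_q_error = median(errors)
```

Because the metric is a ratio, over- and underestimation by the same factor are penalized equally.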
Procedural Graph Structuring
SOPStruct achieves 100% scores for PDDL solvability, dependency completeness, and inputs-from-dependency validation across three public SOP datasets. LLM-based checks report over 95% validity for initial and goal states and over 92% plan completeness, substantially outperforming zero-shot and alternative schema-constrained LLM approaches (Garg et al., 28 Mar 2025).
Unified DBMS Plan Analysis
A unified query plan grammar enabled automated bug discovery across MySQL, PostgreSQL, and TiDB, revealing 17 previously unidentified bugs. Developer effort for parser construction was reduced by a factor of 4-5 compared to DBMS-specific tools (Ba et al., 2024).
5. Trade-offs, Generalizations, and Limitations
Structured plan representations embody domain- and algorithm-specific trade-offs in complexity, generality, and fidelity.
Complexity vs. Detail
As shown in human cognitive studies, representations that aggressively prune irrelevant details deliver efficiency but risk suboptimality if the abstraction omits critical effects. The optimal representation is selected by an objective that trades expected planning utility against the cost of representational complexity (Ho et al., 2021).
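The utility-versus-complexity trade-off can be written schematically as follows. The symbols are chosen here for illustration and are not taken from the cited work: $c$ ranges over a set $\mathcal{C}$ of candidate representations, $U(c)$ is the expected utility of planning with $c$, $C(c)$ is its complexity cost, and $\lambda \ge 0$ is an assumed weighting parameter.

```latex
c^{*} = \arg\max_{c \in \mathcal{C}} \bigl[\, U(c) - \lambda\, C(c) \,\bigr]
```

Under this form, a larger $\lambda$ favors coarser abstractions, while $\lambda = 0$ recovers planning over the full, unsimplified representation whenever that maximizes utility.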
Generality and Extensibility
Graph-based and grammar-driven representations (e.g., BiGG, UPlan) are extensible: new operator types, edge relations, and node attributes can be incorporated without wholesale rewriting. However, nuances of DBMS-specific features or unforeseen procedural constructs may require targeted extension (Chang et al., 2024, Ba et al., 2024).
Verification and Human Oversight
While formal soundness guarantees (from PDDL solvers or logic programming) ensure structural integrity, completeness and operational relevance may require LLM-based semantic validation or domain-expert review, particularly in ambiguous procedural texts (Garg et al., 28 Mar 2025).
Computational and Implementation Costs
Bidirectional message passing and attention in deep neural encodings incur higher inference time; logic-program-based approaches demand logic solver runtimes. Cross-system unification requires maintenance of converters as data or operator formats evolve (Chang et al., 2024, Agarwal et al., 2024, Ba et al., 2024).
6. Broader Implications and Future Directions
Structured plan representations constitute foundational infrastructure for data management, workflow automation, symbolic and neuro-symbolic AI planning, and embodied decision making.
Emerging trends include integrating heterogeneous edge encodings and residual connections for deeper graph plans, user-in-the-loop graphical editing, and meta-learning of abstraction selection. Ongoing work also explores extending acyclic plan formalisms to cyclic (or HTN) models, embedding cost and efficiency objectives, and grounding abstract plans in real-world resource control for automated execution (Garg et al., 28 Mar 2025, Chang et al., 2024).
The field continues to combine foundational principles (semantic rigor, verifiability, extensibility, efficiency) with empirical validation across diverse applications, from database query optimization to robust robot navigation and explainable procedure automation.