Query Plan Language (QPL)

Updated 21 April 2026

Query Plan Language (QPL) is a formal, modular representation system for query execution plans, clearly defining operator trees and DAGs across diverse systems.
It employs standardized grammars like BNF and EBNF to specify both logical and physical plans, thereby facilitating semantic parsing and cross-engine analysis.
QPL supports advanced applications such as federated query execution, interactive plan repair, and detailed dataflow introspection, driving improved query optimization.

Query Plan Language (QPL) is a class of formally defined languages and representations designed to specify query execution plans in a structured, modular, and often DBMS-agnostic manner. QPLs have become central tools for research on semantic parsing, query optimization, cross-system plan analysis, federated query execution, and the design of explainable or compositional query interfaces. The various lines of work on QPL, as surveyed in the academic literature, encompass logical and physical plan representations, natural-language variants for NL2SQL, use in heterogeneous federations, and unified meta-representations for cross-database tooling (Eyal et al., 2023, Ba et al., 2024, Liu et al., 27 Oct 2025, Eyal et al., 2023, Zhang et al., 26 Aug 2025, Cheng et al., 2020).

1. Syntax and Formulations of QPL

The core design of QPL is to express query plans as explicit operator trees, DAGs, or step sequences, with operator semantics and operator categories precisely defined. Several varieties appear in the literature:

Relational/Algebraic QPLs: These define a finite set of core operators (e.g., Scan, Filter, Join, Aggregate, Sort, TopSort, Union, Intersect, Except) and represent queries as trees with nodes of the form #i = OP(...args...). Formal grammars are typically given in BNF or EBNF (Eyal et al., 2023, Eyal et al., 2023).
Physical Plan QPLs: To enable DBMS-agnostic introspection and testing, physical plans are modeled as trees of nodes, each with an operation label and a list of properties. The syntax is defined in EBNF, supporting categories such as Producer, Join, Combinator, Folder, Executor, Projector, Consumer, and extensible properties for cardinality, cost, configuration, and status. Serialization can be in text, JSON, or trees (Ba et al., 2024).
Natural-Language QPLs: For NL2SQL systems, plans can be free-form natural language step sequences, optionally labeled by reasoning type (e.g., "Counterfactual Condition"). Each step documents a subtask in reasoning, written to be interpretable by both LLMs and humans (Liu et al., 27 Oct 2025).
Analytical QPLs: Frameworks targeting non-SQL analytical workflows use operator languages that incorporate not only classical operators but also advanced analytics primitives (e.g., PCA, AnomalyDetect). These are specified in formal EBNF and rendered as typed operator trees or nested JSON (Zhang et al., 26 Aug 2025).
FedQPL for RDF Federations: The FedQPL dialect encodes logical plans for federated query execution. Its abstract tree syntax is built from request nodes (each naming a federation member and an access pattern), join and union nodes (binary or n-ary), and variant "add" operations that extend a subplan with a triple pattern or a BGP (Cheng et al., 2020).

The unifying feature is an explicit representation of dataflow, operators, and plan decomposition, which supports programmatic use as well as interactive or assistive use.
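
As a rough illustration of the "#i = OP(...args...)" node form described above, the following Python sketch parses a small plan into a table of nodes and finds its root. The concrete line syntax, the PlanNode class, and the example tables are assumptions made for this sketch; the grammars in the cited papers differ in detail.

```python
import re
from dataclasses import dataclass, field

# Assumed line shape for this sketch: "#<id> = <Operator>(<comma-separated args>)".
LINE_RE = re.compile(r"^#(\d+)\s*=\s*(\w+)\((.*)\)$")


@dataclass
class PlanNode:
    node_id: int
    op: str
    inputs: list = field(default_factory=list)  # ids of child nodes
    args: list = field(default_factory=list)    # remaining literal arguments


def parse_plan(text):
    """Parse one node per line and return {node_id: PlanNode}."""
    nodes = {}
    for raw in text.strip().splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            raise ValueError(f"not a plan line: {raw!r}")
        node_id, op, arg_text = match.groups()
        node = PlanNode(int(node_id), op)
        # Naive comma split: arguments containing commas are not supported here.
        for arg in (a.strip() for a in arg_text.split(",") if a.strip()):
            if re.fullmatch(r"#\d+", arg):
                child = int(arg[1:])
                if child not in nodes:
                    raise ValueError(f"#{node_id} references undefined #{child}")
                node.inputs.append(child)
            else:
                node.args.append(arg)
        nodes[node.node_id] = node
    return nodes


def root_of(nodes):
    """The root is the single node that no other node consumes."""
    consumed = {child for node in nodes.values() for child in node.inputs}
    roots = [node_id for node_id in nodes if node_id not in consumed]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {roots}")
    return roots[0]


# Hypothetical plan over made-up tables "singer" and "concert".
EXAMPLE = """
#1 = Scan(singer)
#2 = Filter(#1, age > 30)
#3 = Scan(concert)
#4 = Join(#2, #3, singer.id = concert.singer_id)
#5 = Aggregate(#4, count)
"""

plan = parse_plan(EXAMPLE)
print(root_of(plan), plan[4])
```

Because every node may only reference earlier node ids, the parsed structure is acyclic by construction, which is what lets the same textual form describe either a tree or a DAG.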

2. Operator Semantics and Plan Execution

QPL semantics are designed to be precise, compositional, and portable:

Core Operators: Operators such as Scan, Filter, Join, Aggregate, and Set Operations (Except, Union, Intersect) are defined by tuple stream semantics. Filter applies a predicate, Join combines tuples on key conditions, Aggregate performs grouping and accumulation, and so on. Their mathematical semantics are given as set or multiset transformations, and most QPL formulations provide exact translation rules to equivalent SQL CTEs or SPARQL fragments (Eyal et al., 2023, Eyal et al., 2023).
Physical Plan Semantics: Physical QPLs annotate each node with estimated or observed properties, including cost, cardinality, and configuration (e.g., "Cost->startup:26150.38", "Configuration->HashCond:(t0.c0=t1.c0)"). These facilitate execution cost reasoning, visual debugging, and operator mapping across DBMSs (Ba et al., 2024).
Execution Workflow: QPL-based systems perform plan construction (e.g., via semantic parsing or plan translation), followed by execution either by translation to SQL or direct operator-by-operator data processing (especially for analytical QPLs outside standard DBMS engines) (Zhang et al., 26 Aug 2025). Evaluation may be incremental, operator-validated, and structure-aware.
Advanced Analytics: QPLs designed for complex analytics explicitly model operators such as PCA or AnomalyDetect, with semantics defined on matrices and supporting column-level specification of features (Zhang et al., 26 Aug 2025).
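
The translation from plan steps to SQL CTEs mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the translation rules of any cited system: the step encoding, the step_to_select and plan_to_sql helpers, and the employee table are all invented for the example, and only three operators are covered.

```python
import sqlite3

# Each step is (name, operator, parameters); later steps refer to earlier names.
PLAN = [
    ("s1", "Scan", {"table": "employee"}),
    ("s2", "Filter", {"input": "s1", "predicate": "salary > 50000"}),
    ("s3", "Aggregate", {"input": "s2", "group_by": "dept", "agg": "COUNT(*) AS n"}),
]


def step_to_select(op, p):
    """Render one plan step as the SELECT that forms the body of its CTE."""
    if op == "Scan":
        return f"SELECT * FROM {p['table']}"
    if op == "Filter":
        return f"SELECT * FROM {p['input']} WHERE {p['predicate']}"
    if op == "Aggregate":
        return (
            f"SELECT {p['group_by']}, {p['agg']} "
            f"FROM {p['input']} GROUP BY {p['group_by']}"
        )
    raise ValueError(f"unsupported operator: {op}")


def plan_to_sql(plan):
    """One CTE per step; the final SELECT reads from the last step."""
    ctes = ",\n".join(
        f"{name} AS ({step_to_select(op, params)})" for name, op, params in plan
    )
    return f"WITH {ctes}\nSELECT * FROM {plan[-1][0]}"


# Made-up data, used only to show that the generated SQL executes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("ann", "eng", 90000), ("bob", "eng", 60000),
     ("cat", "ops", 40000), ("dan", "ops", 70000)],
)

sql = plan_to_sql(PLAN)
rows = sorted(conn.execute(sql).fetchall())
print(sql)
print(rows)
```

The sketch builds SQL by string interpolation, which is acceptable for a trusted, machine-generated plan but would need identifier and predicate validation before being applied to untrusted input.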

3. QPL in Semantic Parsing and Data Access

QPL has emerged as a target language for semantic parsing from natural language to database queries, driven by its modularity and compositional structure:

Text-to-QPL vs. Text-to-SQL: Empirical results show that targeting QPL as an intermediate representation yields higher execution accuracy for medium and hard queries relative to traditional text-to-SQL pipelines. Ablation studies indicate QPL's stepwise decomposition is more learnable for LLMs, particularly for compositional and nested queries (Eyal et al., 2023, Eyal et al., 2023).
Automatic and Interactive Decomposition: QPL supports the training of auxiliary models for question decomposition (QD), which break down complex questions into explicit sub-questions aligned with individual plan steps. This enables explainable semantic parsing and facilitates user understanding of generated queries (Eyal et al., 2023, Eyal et al., 2023).
Direct Plan Inspection and Repair: QPL plans can be presented as explicit graphs or sequences, enabling end-users (including non-programmers) to inspect, correct, and authorize plans before execution. Interactive iterative refinement using LLMs ("plan repair models") efficiently corrects erroneous subtrees or predicates (Eyal et al., 2023).
Error Analysis and User Studies: Human evaluations indicate that QPL's modular, stepwise output is easier for users to interpret and verify than monolithic SQL code, with correctness judgment rates improving by 33 points for QPL (Eyal et al., 2023).

4. QPL for Federated and Heterogeneous Systems

QPL is effective for describing and optimizing queries over federated or heterogeneous data sources:

FedQPL (RDF Federations): FedQPL introduces a logical plan language tailored to source selection and execution planning across federations with diverse interface capabilities (e.g., triple pattern, BGP, brTPF, SPARQL endpoint). Its formal semantics and equivalence rules permit rigorous reasoning about minimal-cost source assignments and optimized plan rewritings (Cheng et al., 2020).
Complexity Results: The minimal source selection problem for FedQPL is NP-hard and in $\Sigma_2^P$, with formal reductions provided. FedQPL is strictly more expressive for source assignment than previous restricted approaches (Cheng et al., 2020).
Plan Rewriting and Optimization: FedQPL offers a rich algebra of plan rewrites, enabling optimizers to exploit interface features (e.g., join pushdown, union flattening) and to transform logical plans into cost-efficient, federated execution strategies (Cheng et al., 2020).
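
To make the idea of a plan rewrite concrete, the sketch below applies union flattening to a small logical plan tree: a union whose child is itself a union is merged into a single n-ary union, which is sound because union is associative. The RequestNode, UnionNode, and JoinNode classes and the member names are hypothetical stand-ins chosen for this example and do not reproduce FedQPL's actual syntax or rule set.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RequestNode:
    member: str   # federation member that answers the request
    pattern: str  # access pattern sent to that member


@dataclass(frozen=True)
class UnionNode:
    children: Tuple[object, ...]


@dataclass(frozen=True)
class JoinNode:
    left: object
    right: object


def flatten_unions(plan):
    """Rewrite nested unions into one n-ary union, bottom-up."""
    if isinstance(plan, RequestNode):
        return plan
    if isinstance(plan, JoinNode):
        return JoinNode(flatten_unions(plan.left), flatten_unions(plan.right))
    if isinstance(plan, UnionNode):
        flat = []
        for child in plan.children:
            child = flatten_unions(child)
            if isinstance(child, UnionNode):
                flat.extend(child.children)  # lift grandchildren one level
            else:
                flat.append(child)
        return UnionNode(tuple(flat))
    raise TypeError(f"unknown plan node: {plan!r}")


# Hypothetical federation members m1..m3 answering the same triple pattern.
tp = "?s foaf:knows ?o"
nested = UnionNode((
    RequestNode("m1", tp),
    UnionNode((RequestNode("m2", tp), UnionNode((RequestNode("m3", tp),)))),
))
flat = flatten_unions(nested)
print(len(flat.children), [c.member for c in flat.children])
```

The rewrite preserves the left-to-right order of the requests and leaves join nodes in place, only recursing into their inputs.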

5. QPL for Database-Agnostic Plan Analysis and Tooling

A contemporary line of work advocates for QPL as a universal plan interchange and introspection format:

Conceptual Component	Description	Example Categories
Operations	Execution steps, partitioned by function	Producer, Combinator, Join, Folder
Properties	Metadata over operations and plans (cardinality, cost, configuration)	Cardinality->rows, Cost->total, etc.
Formats	Serialization targets for storage/visualization (text, JSON, XML, YAML)	Text (EXPLAIN), JSON, Graph, Table

Physical Plan Canonicalization: By translating DBMS-specific physical plans into a normalized QPL structure, cross-engine plan analysis, plan-based testing (e.g., QPG, CERT), and unified visualization become straightforward (Ba et al., 2024). Plan properties can be compared structurally across engines, revealing optimization gaps.
Mapping Algorithms: Parsers and converters from PostgreSQL, MySQL, and other engines are specified as recursive traversals with mapping tables for operation and property normalization. Unknown operators or properties are handled in a forward-compatible way: generic tools simply ignore identifiers they do not recognize (Ba et al., 2024).
Applications: QPL underlies plan-based test oracles, cross-engine benchmark analyses (e.g., TPC-H plans), and generic visualization tools (e.g., one QPL backend replaces five DBMS-specific parsers in PEV2) (Ba et al., 2024).
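
The recursive traversal with mapping tables described above might look like the following sketch for a PostgreSQL plan. The input keys ("Node Type", "Startup Cost", "Total Cost", "Plan Rows", "Hash Cond", "Plans") follow PostgreSQL's EXPLAIN (FORMAT JSON) output, and the property names follow the "Cost->startup" style quoted earlier. The contents of OPERATION_MAP, the "Unknown" fallback category, the output dictionary layout, and the sample plan values are assumptions made for illustration and are not taken from the cited work.

```python
# Illustrative mapping tables; a real converter would cover many more entries.
OPERATION_MAP = {
    "Seq Scan": "Producer",
    "Index Scan": "Producer",
    "Hash Join": "Join",
    "Nested Loop": "Join",
    "Aggregate": "Folder",
    "Sort": "Combinator",
}
PROPERTY_MAP = {
    "Startup Cost": "Cost->startup",
    "Total Cost": "Cost->total",
    "Plan Rows": "Cardinality->rows",
    "Hash Cond": "Configuration->HashCond",
}


def normalize(pg_node):
    """Recursively convert one PostgreSQL plan node into a normalized node."""
    return {
        "operation": pg_node["Node Type"],
        # Unmapped operators are kept but labeled, so consumers can skip them.
        "category": OPERATION_MAP.get(pg_node["Node Type"], "Unknown"),
        # Unmapped properties are dropped by this sketch.
        "properties": {
            PROPERTY_MAP[key]: value
            for key, value in pg_node.items()
            if key in PROPERTY_MAP
        },
        "children": [normalize(child) for child in pg_node.get("Plans", [])],
    }


# Hand-written sample in the shape of the "Plan" object of EXPLAIN (FORMAT JSON).
pg_plan = {
    "Node Type": "Hash Join",
    "Startup Cost": 26150.38,
    "Total Cost": 58000.0,
    "Plan Rows": 1000,
    "Hash Cond": "(t0.c0 = t1.c0)",
    "Plans": [
        {"Node Type": "Seq Scan", "Relation Name": "t0",
         "Total Cost": 15000.0, "Plan Rows": 500000},
        {"Node Type": "Hash", "Total Cost": 20000.0, "Plans": [
            {"Node Type": "Seq Scan", "Relation Name": "t1",
             "Total Cost": 12000.0, "Plan Rows": 400000},
        ]},
    ],
}

normalized = normalize(pg_plan)
print(normalized["category"], normalized["properties"])
```

Keeping the engine's original operation label alongside the normalized category means that tools written against the categories keep working when an engine introduces an operator the mapping table does not yet know.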

6. Extensions, Limitations, and Future Directions

QPL research continues to explore novel extensions and address limitations:

Plan Diversification and Robustness: For natural-language reasoning, plan diversification (sampling multiple candidate QPLs and majority voting for answer agreement) improves execution robustness and mitigates planner stochasticity (Liu et al., 27 Oct 2025).
Feedback-Guided Prompting: Integrating feedback-guided, meta-prompted guidelines into the QPL planner's system prompt enables bootstrapped correction of systematic planner errors, generalizing to novel substructures with minimal orchestration overhead (Liu et al., 27 Oct 2025).
Multilingual and Cross-Entity Linking: QPL-based systems for bilingual NL2SQL explicitly encode entity-variant lines to mitigate transliteration and schema mismatch, ensuring correct plan grounding across languages (Liu et al., 27 Oct 2025).
Coverage and Extensibility: Physical QPLs support seamless extension to new operator types (e.g., GPUJoin, WindowFunction) by grammatically extending operation categories and mapping tables. Compatibility is maintained by ignoring unknown identifiers in generic consumers (Ba et al., 2024). However, textual or bytecode-only plans may require bespoke parsing for QPL normalization.
Natural Language Limitations: Natural-language QPLs can become verbose for highly complex queries, potentially exceeding prompt or context window limits; plan verification remains informal for free-form plans, and majority voting increases execution cost (Liu et al., 27 Oct 2025).
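
The plan diversification step listed above, in which several candidate plans are executed and the most common answer wins, can be sketched as a simple majority vote over execution results. The canonical and vote helpers, the agreement score, and the stubbed executor are assumptions made for this example; the cited system's sampling and voting details may differ.

```python
from collections import Counter


def canonical(rows):
    """Order-insensitive, hashable form of a result set."""
    return tuple(sorted(tuple(row) for row in rows))


def vote(candidate_plans, execute):
    """Execute every candidate plan and return (winning rows, agreement).

    Plans whose execution raises are skipped. Agreement is the share of all
    candidates, including failed ones, that produced the winning result.
    """
    results = []
    for plan in candidate_plans:
        try:
            results.append(canonical(execute(plan)))
        except Exception:
            continue  # a failed plan casts no vote
    if not results:
        return None, 0.0
    winner, count = Counter(results).most_common(1)[0]
    return list(winner), count / len(candidate_plans)


# Stub executor standing in for a real plan runner; the results are made up.
FAKE_RESULTS = {
    "plan_a": [(2,)],
    "plan_b": [(2,)],
    "plan_c": [(3,)],
    "plan_d": [(2,)],
}


def fake_execute(plan):
    return FAKE_RESULTS[plan]  # KeyError simulates a plan that fails to run


candidates = ["plan_a", "plan_b", "plan_c", "plan_d", "plan_broken"]
answer, agreement = vote(candidates, fake_execute)
print(answer, agreement)
```

Every candidate must be executed before a winner can be chosen, which is the source of the additional execution cost noted above.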

A plausible implication is that QPLs will underpin next-generation research both in the interpretability of semantic parsing systems and in the generalization of plan-based tooling across the rapidly evolving database and LLM landscapes.