Execution Description Language (EDL)
- EDL is a natural-language–style, stepwise intermediate representation designed to bridge NL queries and SQL by using explicit, numbered execution operators.
- It decomposes query planning into two stages—NLQ-to-EDL and EDL-to-SQL—thereby reducing semantic drift and improving compositional accuracy.
- EDL’s explicit operator mapping and tree structure enable practical query transformations and measurable performance gains in large-scale, cross-domain databases.
Execution Description Language (EDL) is a natural-language–style, stepwise intermediate representation for database query generation, specifically designed to mediate between a user's natural-language question (NLQ) and a corresponding SQL query. EDL is structured as an explicit, numbered list of execution operators, each mapping directly to a classical relational database operation. This approach systematizes query planning in neural text-to-SQL systems by decomposing semantic parsing into two discrete stages: NLQ-to-EDL and EDL-to-SQL. EDL has been introduced and formalized in the CRED-SQL framework to reduce semantic drift and improve compositional accuracy, especially in large-scale, cross-domain databases (Duan et al., 18 Aug 2025).
1. Formal Definition and Syntax
EDL adopts a formal, process-oriented syntax organized as a sequence of operator invocations, each associated with a particular execution step. The top-level grammar is provided in Backus–Naur Form (BNF) as follows:
$\begin{aligned}
\langle \text{EDLDoc}\rangle\; &::=\; \langle \text{StepList}\rangle \\
\langle \text{StepList}\rangle\; &::=\; \langle \text{Step}\rangle\,(\texttt{\textbackslash n}\;\langle \text{Step}\rangle)^* \\
\langle \text{Step}\rangle\; &::=\; \texttt{\#}\,\langle \text{StepNum}\rangle\,\texttt{.}\,\langle \text{OpInvocation}\rangle \\
\langle \text{StepNum}\rangle\; &::=\; [\texttt{1}\!-\!\texttt{9}]\,[\texttt{0}\!-\!\texttt{9}]^* \\
\langle \text{OpInvocation}\rangle\; &::=\; \langle \text{Operator}\rangle\;\langle \text{ArgList}\rangle \\
\langle \text{Operator}\rangle\; &::=\; \texttt{ScanTable}\mid \texttt{Join}\mid \texttt{ReserveRows}\mid \texttt{GroupBy}\mid \texttt{HavingClause}\mid \texttt{Sort}\mid \texttt{Limit}\mid \\
&\quad\;\;\texttt{SelectColumn}\mid \texttt{Subquery}\mid \texttt{SetOp}\mid \texttt{ArithmeticCalc}\mid \dots \\
\langle \text{ArgList}\rangle\; &::=\; \text{free-form English describing table names, aliases, columns, conditions, etc.}
\end{aligned}$
EDL plans are explicit trees: leaf nodes typically instantiate ScanTable or Subquery, while internal nodes perform operations such as Join, ReserveRows (filtering), GroupBy, or arithmetic calculation. Later steps can reference earlier results by step number (e.g., "From #1, keep rows where …"), making the chaining of prior computation explicit.
A summary of core operators appears below.
| Operator | Function | Example Usage |
|---|---|---|
| ScanTable | Retrieve table rows | Retrieve all rows from [city] as T1 |
| Join | Join two tables | Join [Pets] as P on HP.[PetID] = P.[PetID] |
| ReserveRows | Filter rows | From #1, keep rows where ... |
| GroupBy | Aggregation grouping | Group #4 by [District] |
| HavingClause | Aggregate-level filtering | HAVING count(*) > 5 |
| Sort | Sort rows | Sort by [Population], descending |
| Limit | Row count truncation | Limit to 10 rows |
| SelectColumn | Project columns | Select [major], [age] from #7 |
| Subquery | Nested result | Retrieve all cities with ... as T2 |
| SetOp | Set operation | Intersect #3 and #4 |
| ArithmeticCalc | Computed columns | Compute avg_population as ... |
| DateCalculation | Temporal computation | Extract year from [BirthDate] |
| Cast | Type conversion | Cast [Amount] as float |
| Ranking | Row ranking | Rank students by GPA |
| SubstringExtraction | Substring from column | Extract prefix from [Name] |
| CaseStatement | Conditional value selection | CASE WHEN ... END |
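Given the grammar, individual EDL steps can be parsed mechanically into (step number, operator, arguments) triples. A minimal sketch follows; the regex and the `parse_edl` helper are illustrative, not part of CRED-SQL:

```python
import re

# From the BNF: a step is "#<num>. <Operator> <free-form args>".
# The optional colon after the operator matches the worked examples below.
STEP_RE = re.compile(r"#(?P<num>[1-9][0-9]*)\.\s*(?P<op>[A-Za-z]+)\s*:?\s*(?P<args>.*)")

def parse_edl(plan: str):
    """Parse an EDL document into (step_number, operator, args) triples."""
    steps = []
    for line in plan.strip().splitlines():
        m = STEP_RE.match(line.strip())
        if not m:
            raise ValueError(f"not an EDL step: {line!r}")
        steps.append((int(m["num"]), m["op"], m["args"]))
    return steps

plan = """\
#1. ScanTable: Retrieve all rows from the [city] table as T1.
#2. GroupBy: Group #1 by the [District] column.
#3. SelectColumn: Select count(*) as city_count from #2."""
for num, op, args in parse_edl(plan):
    print(num, op)
```

The free-form `args` string is deliberately left unparsed here, mirroring the grammar's free-form English ArgList.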
2. Exemplification of EDL for Query Planning
EDL plans are constructed for both simple and complex NLQs. Two illustrative transformations are given:
Example 1: Single-table Aggregation
- NLQ: "Find the number of cities in each district whose population is greater than the average population of cities?"
- EDL:
  1. Scan Table: Retrieve all rows from the [city] table as T1.
  2. Subquery: Retrieve all rows from the [city] table as T2.
  3. Arithmetic Calculation: Compute avg_population as the average of T2.[Population].
  4. Reserve Rows: From #1, keep rows where T1.[Population] > #3.avg_population.
  5. Group By: Group #4 by the [District] column.
  6. Select Column: Select count(*) as city_count from #5.
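Example 1's steps collapse into a single SQL query that can be executed end-to-end. A sketch on a toy `city` table (the schema and rows here are invented for illustration):

```python
import sqlite3

# Toy data; the [city] schema here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (Name TEXT, District TEXT, Population INTEGER)")
conn.executemany("INSERT INTO city VALUES (?, ?, ?)", [
    ("A", "North", 100), ("B", "North", 400),
    ("C", "South", 300), ("D", "South", 50),
])

# One way the six steps collapse into SQL: the Subquery/ArithmeticCalc
# pair (#2-#3) becomes the scalar subquery, Reserve Rows (#4) becomes
# WHERE, Group By (#5) becomes GROUP BY, Select Column (#6) the SELECT
# list. District is added to the SELECT list for readability only.
sql = """
SELECT T1.District, count(*) AS city_count
FROM city AS T1
WHERE T1.Population > (SELECT avg(T2.Population) FROM city AS T2)
GROUP BY T1.District
"""
print(conn.execute(sql).fetchall())
```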
Example 2: Multi-table Join with Negation
- NLQ: "Find the major and age of students who do not have a cat pet."
- EDL:
  1. Scan Table: Retrieve all rows from the [Student] table as S.
  2. Scan Table: Retrieve all rows from the [Has_Pet] table as HP.
  3. Join: Join [Pets] as P on HP.[PetID] = P.[PetID].
  4. Reserve Rows: From #3, keep rows where P.[PetType] = 'cat'.
  5. Select Column: Select HP.[StuID] from #4.
  6. Reserve Rows: From #1, keep rows where S.[StuID] is not in #5.
  7. Select Column: From #6, select [major], [age].
These examples highlight EDL's explicit operator chaining and transparent tracking of dataflow and selection logic (Duan et al., 18 Aug 2025).
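The negation pattern in Example 2 can likewise be checked end-to-end on toy tables. The schemas below echo Spider's pets database, but the rows are invented for illustration:

```python
import sqlite3

# Toy Student / Has_Pet / Pets tables; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StuID INTEGER, Major TEXT, Age INTEGER);
CREATE TABLE Has_Pet (StuID INTEGER, PetID INTEGER);
CREATE TABLE Pets (PetID INTEGER, PetType TEXT);
INSERT INTO Student VALUES (1, 'CS', 20), (2, 'Math', 22), (3, 'Bio', 21);
INSERT INTO Has_Pet VALUES (1, 10), (2, 11);
INSERT INTO Pets VALUES (10, 'cat'), (11, 'dog');
""")

# Steps #2-#5 build the inner query (students who DO have a cat);
# step #6's "not in #5" becomes NOT IN; step #7 is the outer SELECT.
sql = """
SELECT S.Major, S.Age
FROM Student AS S
WHERE S.StuID NOT IN (
    SELECT HP.StuID
    FROM Has_Pet AS HP
    JOIN Pets AS P ON HP.PetID = P.PetID
    WHERE P.PetType = 'cat'
)
"""
print(conn.execute(sql).fetchall())
```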
3. Mapping from NLQ to EDL: Model and Training
The NLQ-to-EDL process is cast as a supervised sequence generation task. Spider and Bird SQL annotations are automatically converted to EDL via GPT-4o, with database execution ensuring semantic alignment. Datasets of ⟨NLQ, gold-EDL⟩ pairs are constructed (Spider-EDL and Bird-EDL).
- LLM Base: Qwen2.5-Coder-32B (open-source, code-specialized LLM)
- Finetuning: LoRA (rank 8), two epochs.
- Prompt Template (Inference):
- Task cue: "Translate the following question into an EDL plan."
- Schema context (tables and columns).
- Three to five few-shot examples (NLQ→EDL).
- The query to parse.
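The four-part inference prompt can be assembled with a small helper. The exact CRED-SQL template wording is not reproduced here, so the strings and the `build_edl_prompt` name below are placeholders:

```python
def build_edl_prompt(nlq, schema, examples):
    """Assemble the four-part NLQ-to-EDL prompt: task cue, schema
    context, few-shot pairs, then the query to parse. Wording is
    illustrative, not the exact CRED-SQL template."""
    parts = ["Translate the following question into an EDL plan.", "", "Schema:"]
    for table, cols in schema.items():
        parts.append(f"- {table}({', '.join(cols)})")
    for q, edl in examples:  # three to five few-shot NLQ -> EDL pairs
        parts += ["", f"Question: {q}", f"EDL:\n{edl}"]
    parts += ["", f"Question: {nlq}", "EDL:"]
    return "\n".join(parts)

prompt = build_edl_prompt(
    "How many cities are in each district?",
    {"city": ["Name", "District", "Population"]},
    [("List all districts.",
      "#1. ScanTable: Retrieve all rows from the [city] table as T1.\n"
      "#2. SelectColumn: Select distinct [District] from #1.")],
)
print(prompt)
```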
The learning objective is the next-token cross-entropy loss $\mathcal{L} = -\sum_{t}\log p_\theta\left(y_t \mid y_{<t},\, x,\, s\right)$, with $x$ the NLQ, $s$ the schema context, and $y$ the gold EDL. Inference uses autoregressive decoding with top-1 selection; beam search is optional.
A final consistency check (EDL→SQL→DB) for non-empty results can be used, but is rarely necessary in practice (Duan et al., 18 Aug 2025).
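The consistency check amounts to executing the EDL-derived SQL and keeping the plan only if the result is non-empty. A hypothetical sketch (the function name and acceptance criterion are mine, mirroring the description above):

```python
import sqlite3

def passes_consistency_check(sql: str, conn: sqlite3.Connection) -> bool:
    """EDL -> SQL -> DB check: accept a candidate only if its SQL both
    executes without error and returns at least one row. An illustrative
    filter, not the paper's exact implementation."""
    try:
        return conn.execute(sql).fetchone() is not None
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (District TEXT)")
conn.execute("INSERT INTO city VALUES ('North')")
print(passes_consistency_check("SELECT * FROM city", conn))
print(passes_consistency_check("SELECT * FROM city WHERE 0", conn))
```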
4. EDL-to-SQL Mapping: Deterministic and Model-Based Approaches
EDL-to-SQL conversion is implemented via two main strategies:
- (a) Structure-to-sequence LLMs: The EDL (numbered steps) and schema are serialized as input; the output is SQL. The model is trained with a standard cross-entropy objective on ⟨EDL, SQL⟩ pairs.
- (b) Deterministic template-based code: Each operator is mapped to its SQL clause via explicit pseudocode logic:
```
Initialize:
    SELECT_list   ← []
    FROM_clause   ← ""
    JOIN_clauses  ← []
    WHERE_clause  ← ""
    GROUP_BY      ← []
    HAVING_clause ← ""
    ORDER_BY      ← ""
    LIMIT         ← None

For each step in EDL, in ascending step order:
    op, args ← parse(step)
    match op:
        case ScanTable(table T as alias A):
            FROM_clause ← f"FROM {T} AS {A}"
        case Join(table T as alias A on condition C):
            JOIN_clauses.append(f"JOIN {T} AS {A} ON {C}")
        case ReserveRows(condition C):
            if GROUP_BY not set:
                WHERE_clause ← f"WHERE {C}"
            else:
                HAVING_clause ← f"HAVING {C}"
        case GroupBy(columns cols):
            GROUP_BY ← cols
        case SelectColumn(col c):
            SELECT_list.append(c)
        case ArithmeticCalc(newcol nc = expr):
            SELECT_list.append(f"{expr} AS {nc}")
        case Sort(column c, order dir):
            ORDER_BY ← f"ORDER BY {c} {dir}"
        case Limit(n):
            LIMIT ← n
        case SetOp(type S, q1, q2):
            SQL ← f"({q1}) {S.upper()} ({q2})"

At end, assemble:
    SQL ← SELECT {", ".join(SELECT_list)}
          {FROM_clause}
          {" ".join(JOIN_clauses)}
          {WHERE_clause, if any}
          {"GROUP BY " + ", ".join(GROUP_BY), if GROUP_BY set}
          {HAVING_clause, if any}
          {ORDER_BY, if any}
          {f"LIMIT {LIMIT}", if LIMIT set}
```
Faithfulness follows because the operator-to-clause mapping is explicit and one-to-one. Empirical results show >98% execution accuracy across LLMs when gold EDL is mapped to SQL (Duan et al., 18 Aug 2025).
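A runnable miniature of the template logic can make the mapping concrete. The version below covers only a handful of operators and uses a simplified, pre-parsed step format of my own rather than the free-form English ArgList:

```python
def edl_to_sql(steps):
    """Illustrative miniature of the deterministic EDL->SQL template.
    `steps` is a list of (operator, args) pairs whose args are already
    SQL fragments. A sketch, not the paper's full converter."""
    sel, frm, joins, where, group, having, order, limit = [], "", [], "", [], "", "", None
    for op, args in steps:
        if op == "ScanTable":
            frm = f"FROM {args['table']} AS {args['alias']}"
        elif op == "Join":
            joins.append(f"JOIN {args['table']} AS {args['alias']} ON {args['on']}")
        elif op == "ReserveRows":
            # WHERE before grouping is set, HAVING afterwards.
            if not group:
                where = f"WHERE {args['cond']}"
            else:
                having = f"HAVING {args['cond']}"
        elif op == "GroupBy":
            group = args["cols"]
        elif op == "SelectColumn":
            sel.append(args["col"])
        elif op == "Sort":
            order = f"ORDER BY {args['col']} {args['dir']}"
        elif op == "Limit":
            limit = args["n"]
    parts = [f"SELECT {', '.join(sel)}", frm, *joins, where]
    if group:
        parts.append("GROUP BY " + ", ".join(group))
    parts += [having, order, f"LIMIT {limit}" if limit else ""]
    return " ".join(p for p in parts if p)

sql = edl_to_sql([
    ("ScanTable", {"table": "city", "alias": "T1"}),
    ("ReserveRows", {"cond": "T1.Population > 1000"}),  # precedes GroupBy -> WHERE
    ("GroupBy", {"cols": ["T1.District"]}),
    ("SelectColumn", {"col": "count(*) AS city_count"}),
])
print(sql)
```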
5. Training Objectives and Evaluation Metrics
Supervised training is performed for both the NLQ-to-EDL and EDL-to-SQL mappings, each with a token-wise cross-entropy loss.
Evaluation uses two principal metrics:
- Execution Accuracy (EX): Fraction of test queries for which the predicted execution matches the gold execution.
- Schema-retrieval Recall@k: Proportion of gold tables covered by the top-k retrieved tables.
These two metrics rigorously quantify both upstream schema selection effectiveness and end-to-end semantic fidelity (Duan et al., 18 Aug 2025).
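Both metrics are simple to compute. A sketch follows; the function names and the order-insensitive result comparison are my own conventions:

```python
def execution_accuracy(pred_results, gold_results):
    """EX: fraction of queries whose predicted execution result matches
    the gold result (rows compared ignoring order)."""
    hits = sum(sorted(p) == sorted(g) for p, g in zip(pred_results, gold_results))
    return hits / len(gold_results)

def recall_at_k(retrieved, gold_tables, k):
    """Schema-retrieval Recall@k: share of gold tables covered by the
    top-k retrieved tables."""
    top_k = set(retrieved[:k])
    return len(top_k & set(gold_tables)) / len(gold_tables)

# Toy checks on invented query results and retrieval rankings.
print(execution_accuracy([[(1,)], [(2,)]], [[(1,)], [(3,)]]))   # one of two matches
print(recall_at_k(["city", "country", "river"], ["city", "river"], 2))
```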
6. Empirical Performance and Impact
The integration of EDL into CRED-SQL yields measurable gains over prior art, especially in large, complex schemas.
- On SpiderUnion (GPT-4o), CLSR schema retrieval achieves recall@1 of 40.23% (vs. 8.51% for CRUSH) and recall@3 of 77.07% (vs. 30.56%).
- End-to-end (Qwen2.5-Coder-32B): CRUSH+NLQ→SQL achieves 51.5% EX; CRED-SQL with EDL achieves 73.4% EX (+21.9 points).
- Intermediate representation comparison (Spider, GPT-4o, DIN-SQL): NLQ→SQL 78.1% EX; NLQ→EDL→SQL 83.3% EX (+5.2 points).
- On BirdUnion (MAC-SQL backbone, GPT-4o): CRED-SQL+EDL yields 58.28% EX (vs. 51.17% baseline, +7.11 points).
A full breakdown across upstream and open/closed LLMs shows that EDL consistently improves both schema selection and execution accuracy over direct SQL-generation or previous intermediate representations such as QPL. The adoption of EDL, especially when paired with strong schema retrieval as in CRED-SQL, substantially reduces semantic drift and ensures more faithful mappings in neural semantic parsing pipelines (Duan et al., 18 Aug 2025).