- The paper introduces AQORA, which uses an actor-critic reinforcement learning framework to adapt query plans in real time.
- It integrates a planner extension within Spark SQL, enabling dynamic adjustments to join orders and operator selections.
- Experimental results demonstrate up to a 90% reduction in query execution time compared to traditional optimization methods.
Overview of AQORA: A Learned Adaptive Query Optimizer
AQORA is a learned query optimizer for Spark SQL that refines query plans during execution using reinforcement learning driven by real-time feedback. This paper introduces AQORA, detailing its architecture, its components, and the reinforcement learning framework it employs. By adapting to runtime conditions, AQORA reduces execution time substantially compared to Spark SQL's default optimizer and prior learned query optimizers.
Architectural Design
Key Components
AQORA is designed with two principal components:
- Decision Model: This component uses an actor-critic reinforcement learning framework to evaluate and choose optimization actions. The actor network proposes candidate actions, which the critic network scores based on observed feedback (a minimal sketch of the two networks follows Figure 1).
- Planner Extension: Integrated into Spark SQL, this component captures runtime execution plans and applies the decision model's suggested actions to re-optimize the query as it runs.
Figure 1: An overview of AQORA, which consists of two main components: (1) a decision model that generates optimization actions, and (2) an AQE planner extension that applies these actions and provides feedback to the decision model.
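The paper's summary does not spell out the networks themselves, so the following is a minimal PyTorch sketch of an actor-critic decision model; the state dimension, action count, and layer widths are illustrative assumptions, not AQORA's actual hyperparameters.

```python
# Minimal actor-critic sketch in PyTorch. STATE_DIM, NUM_ACTIONS, and the
# hidden widths are assumptions for illustration only.
import torch
import torch.nn as nn

STATE_DIM = 64    # assumed size of the encoded partial-plan state
NUM_ACTIONS = 8   # assumed number of discrete optimization actions

class Actor(nn.Module):
    """Maps an encoded plan state to a distribution over optimization actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_ACTIONS),
        )

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

class Critic(nn.Module):
    """Estimates the value (expected remaining reward) of a plan state."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, state):
        return self.net(state).squeeze(-1)

# Usage: sample an optimization action for the current plan state.
state = torch.zeros(STATE_DIM)    # placeholder encoded state
dist = Actor()(state)
action = dist.sample()            # index into the action space
value = Critic()(state)           # critic's baseline estimate for this state
```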
Execution-Time Adaptation
AQORA's novel approach leverages staged feedback during query execution to refine execution plans on the fly. This feedback mechanism lets AQORA adjust join orders and operator choices with fine-grained control, balancing runtime performance against optimization accuracy. In contrast, static learned query optimizers only influence plans before execution begins (a sketch of the adaptation loop follows Figure 2).
Figure 2: AQORA is the first learned query optimizer achieving execution-time query plan optimization on Spark SQL.
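To make the staged feedback loop concrete, here is a hedged PySpark sketch. The stage-boundary hook (`next_stage_stats`), state encoder (`encode_state`), and policy stub (`choose_action`) are hypothetical stand-ins: the real planner extension hooks into Spark's adaptive query execution internals rather than the public Python API. The Spark configuration keys, however, are real knobs.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("aqora-sketch").getOrCreate()

def apply_action(action: str) -> None:
    """Translate a discrete action into a real Spark SQL configuration knob."""
    if action == "enable_cbo":
        spark.conf.set("spark.sql.cbo.enabled", "true")
    elif action == "disable_cbo":
        spark.conf.set("spark.sql.cbo.enabled", "false")
    elif action == "favor_broadcast":
        # Raise the threshold so more joins are planned as broadcast joins.
        spark.conf.set("spark.sql.autoBroadcastJoinThreshold",
                       str(64 * 1024 * 1024))

def next_stage_stats():
    """Hypothetical stage-boundary hook yielding observed runtime statistics."""
    yield {"stage": 0, "true_cardinality": 1_200_000}

def encode_state(stats: dict) -> list[float]:
    """Hypothetical encoder from runtime statistics to a state vector."""
    return [float(stats["true_cardinality"])]

def choose_action(state: list[float]) -> str:
    """Stub for the decision model; always favors broadcast joins here."""
    return "favor_broadcast"

# Conceptual loop: after each completed stage, observe true statistics,
# consult the decision model, and adjust the plan for the remaining stages.
for stats in next_stage_stats():
    apply_action(choose_action(encode_state(stats)))
```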
Reinforcement Learning Framework
Actor-Critic Framework
AQORA employs an actor-critic setup, where the actor suggests actions to modify query plans and the critic evaluates how promising those actions are. The framework's adaptability rests on the following components (a sketch of these pieces follows the list):
- State Representation: The state is defined by partial execution plans and their runtime statistics, such as true cardinalities.
- Action Space: Available actions include enabling or disabling cost-based optimization (CBO), modifying join orders, and introducing broadcast hints, allowing the system to adapt in flight.
- Reward Signals: Feedback is captured at multiple stages during execution, with rewards shaped by combined execution metrics and performance improvements.
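As a concrete (though assumed) encoding of these pieces, the sketch below casts them as plain Python types. The field names, the action set, and the per-stage reward shaping are illustrative, not the paper's exact formulation.

```python
# Illustrative encodings of AQORA's state, action, and reward; all names
# and the reward shape are assumptions for the sketch.
from dataclasses import dataclass
from enum import Enum, auto

@dataclass
class PlanState:
    """Partial execution plan plus runtime statistics observed so far."""
    completed_stages: int
    true_cardinalities: list[int]   # observed row counts per finished stage
    pending_joins: int

class Action(Enum):
    ENABLE_CBO = auto()
    DISABLE_CBO = auto()
    SWAP_JOIN_ORDER = auto()
    ADD_BROADCAST_HINT = auto()
    NO_OP = auto()

def staged_reward(baseline_stage_secs: float, stage_secs: float) -> float:
    """Reward shaped from per-stage execution metrics: positive when a
    stage runs faster than the baseline, negative otherwise."""
    return baseline_stage_secs - stage_secs
```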
Policy Optimization with PPO
The policy is optimized with Proximal Policy Optimization (PPO), which balances exploration and exploitation by clipping the probability ratio between the new and old policies. Clipping keeps each update close to the policy that collected the feedback, stabilizing the learning of query execution strategies.
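For reference, this is the standard PPO clipped surrogate objective in PyTorch; the epsilon of 0.2 is the common default rather than a value reported for AQORA.

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, eps=0.2):
    """L_CLIP = -E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)], where
    r = pi_new(a|s) / pi_old(a|s) is the probability ratio."""
    ratio = torch.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Usage with dummy values: a positive advantage and an increased action
# probability give a loss capped by the clip, so one update cannot push
# the policy arbitrarily far.
new_lp = torch.tensor([-0.5])
old_lp = torch.tensor([-0.7])
adv = torch.tensor([1.0])
print(ppo_clip_loss(new_lp, old_lp, adv))
```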
Experimental Evaluation
Experimental evaluations on the JOB, ExtJOB, and STACK benchmarks show that AQORA outperforms the baselines, reducing end-to-end execution time by up to 90% on complex and dynamic queries.
Figure 3: Query performance on three benchmarks, in seconds. Black numbers outside bars represent end-to-end query execution time, while numbers in dark and light bars represent raw query time and optimization cost respectively.
Dynamic Adaptation
A hallmark of AQORA's design is its ability to adapt its optimization strategy dynamically. This allows it to maintain performance even as the underlying data distribution changes, making it robust to evolving workloads.
Figure 4: The first row shows the performance of AQORA and Lero when trained on IMDb-1950 and IMDb-1980 and tested on the full IMDb dataset.
Conclusion
AQORA advances query optimization for Spark SQL by combining learned optimization strategies with execution-time feedback. Its reinforcement learning approach, coupled with tight system integration, delivers substantial performance improvements while remaining adaptable and robust across diverse query environments. Future work could extend AQORA to other SQL engines and refine its model architecture to handle even more complex queries.