- The paper introduces AQORA, which uses an actor-critic reinforcement learning framework to adapt query plans in real time.
- It integrates a planner extension within Spark SQL, enabling dynamic adjustments to join orders and operator selections.
- Experimental results demonstrate up to a 90% reduction in query execution time compared to traditional optimization methods.
Overview of AQORA: A Learned Adaptive Query Optimizer
AQORA is a learned query optimizer for Spark SQL that refines query plans during execution using reinforcement learning driven by real-time feedback. This paper introduces AQORA, detailing its architecture, its components, and the reinforcement learning framework it employs. By adapting to runtime conditions, AQORA reduces execution time substantially compared to Spark SQL's default optimizer and prior learned query optimizers.
Architectural Design
Key Components
AQORA is designed with two principal components:
- Decision Model: This component uses an actor-critic reinforcement learning framework to evaluate and choose optimization actions. The actor network proposes candidate actions, which the critic network scores based on observed feedback (a minimal sketch of the two networks follows Figure 1).
- Planner Extension: Integrated into Spark SQL, this component captures runtime execution plans and applies the decision model's suggested actions to re-optimize the query as it runs.
Figure 1: An overview of AQORA, which consists of two main components: (1) a decision model that generates optimization actions, and (2) an AQE planner extension that applies these actions and provides feedback to the decision model.
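The paper's summary does not spell out the networks themselves, so the following is a minimal PyTorch sketch of an actor-critic decision model; the state dimension, action count, and layer widths are illustrative assumptions, not AQORA's actual hyperparameters.

```python
# Minimal actor-critic sketch in PyTorch. STATE_DIM, NUM_ACTIONS, and the
# hidden widths are assumptions for illustration only.
import torch
import torch.nn as nn

STATE_DIM = 64    # assumed size of the encoded partial-plan state
NUM_ACTIONS = 8   # assumed number of discrete optimization actions

class Actor(nn.Module):
    """Maps an encoded plan state to a distribution over optimization actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_ACTIONS),
        )

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

class Critic(nn.Module):
    """Estimates the value (expected remaining reward) of a plan state."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, state):
        return self.net(state).squeeze(-1)

# Usage: sample an optimization action for the current plan state.
state = torch.zeros(STATE_DIM)    # placeholder encoded state
dist = Actor()(state)
action = dist.sample()            # index into the action space
value = Critic()(state)           # critic's baseline estimate for this state
```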
Execution-Time Adaptation
AQORA's novel approach leverages staged feedback during query execution to refine execution plans on the fly. This feedback mechanism lets AQORA adjust join orders and operator choices with fine-grained control, balancing runtime performance against optimization accuracy. In contrast, static learned query optimizers only influence plans before execution begins (a sketch of the adaptation loop follows Figure 2).
Figure 2: AQORA is the first learned query optimizer achieving execution-time query plan optimization on Spark SQL.
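To make the staged feedback loop concrete, here is a hedged PySpark sketch. The stage-boundary hook (`next_stage_stats`), state encoder (`encode_state`), and policy stub (`choose_action`) are hypothetical stand-ins: the real planner extension hooks into Spark's adaptive query execution internals rather than the public Python API. The Spark configuration keys, however, are real knobs.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("aqora-sketch").getOrCreate()

def apply_action(action: str) -> None:
    """Translate a discrete action into a real Spark SQL configuration knob."""
    if action == "enable_cbo":
        spark.conf.set("spark.sql.cbo.enabled", "true")
    elif action == "disable_cbo":
        spark.conf.set("spark.sql.cbo.enabled", "false")
    elif action == "favor_broadcast":
        # Raise the threshold so more joins are planned as broadcast joins.
        spark.conf.set("spark.sql.autoBroadcastJoinThreshold",
                       str(64 * 1024 * 1024))

def next_stage_stats():
    """Hypothetical stage-boundary hook yielding observed runtime statistics."""
    yield {"stage": 0, "true_cardinality": 1_200_000}

def encode_state(stats: dict) -> list[float]:
    """Hypothetical encoder from runtime statistics to a state vector."""
    return [float(stats["true_cardinality"])]

def choose_action(state: list[float]) -> str:
    """Stub for the decision model; always favors broadcast joins here."""
    return "favor_broadcast"

# Conceptual loop: after each completed stage, observe true statistics,
# consult the decision model, and adjust the plan for the remaining stages.
for stats in next_stage_stats():
    apply_action(choose_action(encode_state(stats)))
```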
Reinforcement Learning Framework
Actor-Critic Framework
AQORA employs an actor-critic setup, where the actor suggests actions to modify query plans and the critic evaluates how promising those actions are. The framework's adaptability rests on the following components (a sketch of these pieces follows the list):
- State Representation: The state is defined by partial execution plans and their runtime statistics, such as true cardinalities.
- Action Space: Available actions include enabling or disabling cost-based optimization (CBO), modifying join orders, and introducing broadcast hints, allowing the system to adapt in flight.
- Reward Signals: Feedback is captured at multiple stages during execution, with rewards shaped by combined execution metrics and performance improvements.
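As a concrete (though assumed) encoding of these pieces, the sketch below casts them as plain Python types. The field names, the action set, and the per-stage reward shaping are illustrative, not the paper's exact formulation.

```python
# Illustrative encodings of AQORA's state, action, and reward; all names
# and the reward shape are assumptions for the sketch.
from dataclasses import dataclass
from enum import Enum, auto

@dataclass
class PlanState:
    """Partial execution plan plus runtime statistics observed so far."""
    completed_stages: int
    true_cardinalities: list[int]   # observed row counts per finished stage
    pending_joins: int

class Action(Enum):
    ENABLE_CBO = auto()
    DISABLE_CBO = auto()
    SWAP_JOIN_ORDER = auto()
    ADD_BROADCAST_HINT = auto()
    NO_OP = auto()

def staged_reward(baseline_stage_secs: float, stage_secs: float) -> float:
    """Reward shaped from per-stage execution metrics: positive when a
    stage runs faster than the baseline, negative otherwise."""
    return baseline_stage_secs - stage_secs
```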
Policy Optimization with PPO
The policy is optimized with Proximal Policy Optimization (PPO), which balances exploration and exploitation by clipping the probability ratio between the new and old policies. Clipping keeps each update close to the policy that collected the feedback, stabilizing the learning of query execution strategies.
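For reference, this is the standard PPO clipped surrogate objective in PyTorch; the epsilon of 0.2 is the common default rather than a value reported for AQORA.

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, eps=0.2):
    """L_CLIP = -E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)], where
    r = pi_new(a|s) / pi_old(a|s) is the probability ratio."""
    ratio = torch.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Usage with dummy values: a positive advantage and an increased action
# probability give a loss capped by the clip, so one update cannot push
# the policy arbitrarily far.
new_lp = torch.tensor([-0.5])
old_lp = torch.tensor([-0.7])
adv = torch.tensor([1.0])
print(ppo_clip_loss(new_lp, old_lp, adv))
```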
Experimental Evaluation
Experimental evaluations on the JOB, ExtJOB, and STACK benchmarks show that AQORA outperforms the baselines, reducing end-to-end execution time by up to 90% on complex and dynamic queries.
Figure 3: Query performance on three benchmarks, in seconds. Black numbers outside bars represent end-to-end query execution time, while numbers in dark and light bars represent raw query time and optimization cost respectively.
Dynamic Adaptation
A hallmark of AQORA's design is its ability to adapt its optimization strategy dynamically. This allows it to maintain performance even as the underlying data distribution changes, making it robust to evolving workloads.
Figure 4: The first row shows the performance of AQORA and Lero when trained on IMDb-1950 and IMDb-1980 and tested on the full IMDb dataset.
Conclusion
AQORA advances query optimization for Spark SQL by combining learned optimization strategies with execution-time feedback. Its reinforcement learning approach, coupled with tight system integration, delivers substantial performance improvements while remaining adaptable and robust across diverse query environments. Future work could extend AQORA to other SQL engines and refine its model architecture to handle even more complex queries.