PDE-Bench: Benchmarking PDE Boundary Control
- PDE-Bench is a comprehensive open-source suite for benchmarking data-driven and model-based boundary control of PDEs.
- It offers standardized Gym-style environments simulating 1D transport, reaction–diffusion, and 2D Navier–Stokes control problems with rigorous numerical solvers.
- The suite enables direct comparison between classical controllers and RL algorithms, highlighting key issues like stability, control cost, and scalability.
PDE-Bench
PDE-Bench is a comprehensive, open-source benchmarking suite designed for data-driven, model-free boundary control and reinforcement learning (RL) of partial differential equations (PDEs). It provides standardized environments simulating canonical PDE control problems with a focus on boundary actuation, filling a critical gap in existing scientific machine learning and control benchmarks. The framework supports direct comparison of off-the-shelf RL algorithms against classical, model-based controllers (e.g., backstepping, adjoint optimization) using reproducible Gym-style APIs, rigorous numerical solvers, and prescribed performance metrics.
1. Concept and Rationale
PDE-Bench distinguishes itself by targeting the boundary control of canonical PDEs, exposing standardized environments for the 1D transport (hyperbolic), 1D reaction–diffusion (parabolic), and 2D incompressible Navier–Stokes equations. This focus on boundary control is motivated by the prevalence of physical systems that restrict actuation and sensing to the boundary, such as flow control via wall jets, thermal regulation at domain ends, or ramp metering in traffic networks. Conventional benchmarks typically address either supervised learning of PDE solution datasets or control via distributed in-domain actuation, not boundary actuation. PDE-Bench resolves this with Gym-compatible environments that enable RL agents to interact with the PDE via boundary inputs, supporting both classic and data-driven control paradigms (Bhan et al., 2024).
2. Benchmark Problems and Formulations
PDE-Bench provides three foundational boundary control environments:
| PDE | Governing Equation | Actuation Type |
|---|---|---|
| 1D Transport (hyperbolic) | $u_t = u_x + \beta(x)\,u(0,t)$ | Dirichlet at $x = 1$ |
| 1D Reaction–Diffusion (parabolic) | $u_t = u_{xx} + \lambda(x)\,u$ | Dirichlet at $x = 1$ |
| 2D Navier–Stokes (incompressible) | $\nabla \cdot \mathbf{u} = 0$, $\ \mathbf{u}_t + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\nabla p + \nu \nabla^2 \mathbf{u}$ | Dirichlet (top wall) |
Each problem is precisely defined with explicit coefficients, boundary and initial conditions, and includes both a reference model-based controller for comparison and a black-box RL interface. For example, the 1D transport equation includes a recirculation term $\beta(x)\,u(0,t)$, Dirichlet actuation at $x = 1$, and initial conditions sampled uniformly at random. The 1D reaction–diffusion equation is destabilized by the reaction term $\lambda(x)\,u$. In both 1D problems the full state is observable and can be discretized to arbitrary resolution. The 2D Navier–Stokes flow is controlled via a velocity input imposed on the top wall.
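For concreteness, a plausible full statement of the two 1D problems is written below in the boundary-control form standard in the backstepping literature; the unit domain, the homogeneous condition at the uncontrolled end of the parabolic problem, and the symbols $\beta(x)$, $\lambda(x)$, $U(t)$ are assumptions consistent with the description above, not coefficients quoted from the benchmark.

```latex
% 1D transport (hyperbolic): recirculation beta(x)u(0,t), Dirichlet control at x = 1
\begin{aligned}
u_t(x,t) &= u_x(x,t) + \beta(x)\,u(0,t), && x \in (0,1),\; t > 0,\\
u(1,t)   &= U(t) \quad \text{(boundary control input)},\\
u(x,0)   &= u_0(x) \quad \text{(randomly sampled initial condition)}.
\end{aligned}

% 1D reaction--diffusion (parabolic): destabilizing reaction lambda(x)u, Dirichlet control at x = 1
\begin{aligned}
u_t(x,t) &= u_{xx}(x,t) + \lambda(x)\,u(x,t), && x \in (0,1),\; t > 0,\\
u(0,t)   &= 0, \qquad u(1,t) = U(t),\\
u(x,0)   &= u_0(x).
\end{aligned}
```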
3. Environment API and RL Integration
PDE-Bench environments follow a Markov Decision Process (MDP) abstraction:
- State (Observation):
- For 1D PDEs: the discretized solution $u(\cdot, t)$ sampled on the spatial grid points.
- For 2D Navier–Stokes: the velocity-field components sampled on a uniform 2D grid.
- Supports partial observation (e.g., boundary-only sensors).
- Action:
- Scalar in a bounded interval, with problem-specific bounds (different for the transport and Navier–Stokes environments).
- Actuation can be Dirichlet or Neumann at either boundary in 1D; flexible wall selection in 2D.
- Reward Functional:
- Stabilization: a penalty proportional to the spatial $L^2$ norm of the state, $r_t \propto -\|u(\cdot,t)\|_{L^2}$ (see the sketch after this list).
- Episode termination: the terminal reward depends on whether the state norm $\|u(\cdot,T)\|_{L^2}$ falls below a prescribed threshold, with additional terms penalizing control effort.
- Tracking (Navier–Stokes): a penalty on the deviation of the velocity field from the reference field, $r_t \propto -\|\mathbf{u}(\cdot,t) - \mathbf{u}_{\mathrm{ref}}(\cdot,t)\|_{L^2}$.
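As a minimal sketch of how such a stabilization reward could be computed from a discretized state, the function below penalizes the discrete $L^2$ norm plus a quadratic control-effort term; the function name, the rectangle-rule quadrature, and the `effort_weight` coefficient are illustrative assumptions, not the benchmark's exact reward.

```python
import numpy as np

def stabilization_reward(u, action, dx, effort_weight=0.1):
    """Illustrative stabilization reward for a 1D boundary-control PDE.

    u             -- 1D array: PDE state sampled on the spatial grid
    action        -- scalar boundary input applied at this step
    dx            -- grid spacing, used in the discrete L2 norm
    effort_weight -- weight on the control-effort penalty (assumed value)
    """
    l2_norm = np.sqrt(np.sum(u**2) * dx)          # rectangle-rule approximation of ||u(.,t)||_{L^2}
    return -l2_norm - effort_weight * action**2   # penalize state magnitude and control effort
```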
The environment exposes reset (samples initial condition, sets PDE state) and step (applies action, advances PDE, returns next state, reward, done flag) methods, with support for fixed episode lengths (5 s for transport, 1 s for reaction–diffusion, 0.2 s for NS) (Bhan et al., 2024).
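A minimal interaction loop under these conventions might look as follows, assuming the Gymnasium flavour of the reset/step API (five-tuple step return); the environment id `TransportPDE1D-v0` is a hypothetical placeholder, and the actual registered names and constructor arguments are documented in the repository.

```python
import gymnasium as gym  # the suite follows the Gym-style reset/step abstraction

# Hypothetical environment id; see the PDE Control Gym documentation for the real names.
env = gym.make("TransportPDE1D-v0")

obs, info = env.reset()            # samples an initial condition and sets the PDE state
done, episode_return = False, 0.0

while not done:
    action = env.action_space.sample()   # placeholder policy: random boundary input
    obs, reward, terminated, truncated, info = env.step(action)  # advance the PDE one solver step
    episode_return += reward
    done = terminated or truncated

print(f"Episode return: {episode_return:.2f}")
```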
4. Numerical Solvers and Implementation
The suite includes robust, black-box solvers for each PDE:
- 1D Transport: First-order upwind finite differences; explicit Euler time stepping with a fixed step size respecting the hyperbolic CFL condition (a sketch of this update appears after this list).
- 1D Reaction–Diffusion: Second-order central differences for the diffusion term $u_{xx}$; explicit Euler with a strict CFL-type stability constraint on the time step (of order $\Delta t \le \Delta x^2/2$ for unit diffusivity).
- 2D Navier–Stokes: Fractional-step predictor–corrector scheme; iterative Poisson solver for the pressure, second-order central differences in space, with fixed spatial and temporal resolution.
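The following sketch illustrates the kind of first-order upwind / explicit Euler update described for the transport environment, for a PDE of the form $u_t = u_x + \beta(x)\,u(0,t)$ with Dirichlet control at $x = 1$; the grid layout, time step, and recirculation profile `beta` are placeholder assumptions, not the benchmark's internal solver code.

```python
import numpy as np

def upwind_transport_step(u, beta, U_boundary, dx, dt):
    """One explicit Euler step of u_t = u_x + beta(x) * u(0, t)
    with first-order upwind differencing and Dirichlet control at x = 1.

    u          -- state on a uniform grid x_0 = 0, ..., x_{N-1} = 1
    beta       -- recirculation coefficient sampled on the same grid
    U_boundary -- scalar Dirichlet control applied at x = 1
    dx, dt     -- grid spacing and time step (dt <= dx for CFL stability)
    """
    u_new = u.copy()
    # Characteristics of u_t = u_x travel toward x = 0, so upwinding uses the forward difference.
    u_new[:-1] = u[:-1] + dt * ((u[1:] - u[:-1]) / dx + beta[:-1] * u[0])
    u_new[-1] = U_boundary          # Dirichlet actuation at the controlled boundary
    return u_new
```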
The environments are implemented in Python with open-source code and extensive documentation, providing reproducible baselines for RL and model-based controllers, including example training scripts and configuration files for integration with RL libraries (Stable-Baselines3) (Bhan et al., 2024).
5. Benchmarked Algorithms and Hyperparameters
PDE-Bench evaluates standard model-free RL algorithms:
- Proximal Policy Optimization (PPO): On-policy, clipped surrogate objective; batch size 64, with the learning rate and discount factor fixed per the benchmark configuration.
- Soft Actor-Critic (SAC): Off-policy; batch size 256, automatic entropy tuning, with the discount factor taken from the benchmark configuration.
- Policy/Value Networks: Two hidden layers, 64 units each, ReLU activation.
All algorithms use unmodified, standard architectures; no problem-specific inductive biases are included, but future extensions (e.g., enforcing Lipschitz continuity or physics priors) are noted as open directions (Bhan et al., 2024).
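A minimal Stable-Baselines3 training sketch consistent with the settings listed above is shown below; the environment id is a hypothetical placeholder, the total training budget is an assumption, and only the batch size and the two-layer, 64-unit ReLU network come from the text.

```python
import gymnasium as gym
import torch.nn as nn
from stable_baselines3 import PPO

# Hypothetical environment id; substitute the benchmark's registered name.
env = gym.make("TransportPDE1D-v0")

model = PPO(
    "MlpPolicy",
    env,
    batch_size=64,                           # batch size reported above
    policy_kwargs=dict(
        net_arch=[64, 64],                   # two hidden layers of 64 units
        activation_fn=nn.ReLU,               # ReLU activation, as listed above
    ),
    verbose=1,
)
model.learn(total_timesteps=100_000)         # training budget is an assumed placeholder
model.save("ppo_transport_boundary_control")
```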
6. Performance Metrics and Empirical Results
Performance is evaluated both during training (mean ± 95% confidence interval of episode return vs. training steps) and at test time (returns, stability, and control cost averaged over 50 random initializations; an evaluation sketch follows the findings below). Key findings include:
| Task | Backstepping | SAC | PPO |
|---|---|---|---|
| 1D Transport | 246.3 | 184.2 | 172.3 |
| 1D Reaction–Diff. | 299.1 | 229.1 | 293.3 |
| 2D NS (tracking, lower is better) | -7.93 (opt.) | -17.83 | -5.37 |
- Model-based controllers achieve higher returns and lower control effort, and demonstrate provable closed-loop stability.
- RL controllers can stabilize and track the PDE, but exhibit higher control cost due to larger oscillations in the boundary inputs and require higher sample/compute budgets.
- PPO demonstrates more stable convergence in 1D domains than SAC; both reach stabilizing solutions within the allotted training budget (Bhan et al., 2024).
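A sketch of the test-time protocol described above (averaging returns over 50 random initial conditions) is given below; the environment id, the saved-model path, and the choice of SAC are placeholders for whichever trained agent is being evaluated.

```python
import numpy as np
import gymnasium as gym
from stable_baselines3 import SAC

env = gym.make("TransportPDE1D-v0")                   # hypothetical environment id
model = SAC.load("sac_transport_boundary_control")    # placeholder path to a trained agent

returns = []
for _ in range(50):                                   # 50 random initializations, as in the benchmark
    obs, info = env.reset()
    done, ep_return = False, 0.0
    while not done:
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
        ep_return += reward
        done = terminated or truncated
    returns.append(ep_return)

print(f"Mean test return over 50 initializations: {np.mean(returns):.1f} ± {np.std(returns):.1f}")
```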
7. Insights, Limitations, and Directions
PDE-Bench offers several insights and exposes challenges unique to data-driven PDE control:
- Standard RL is effective for boundary feedback stabilization/tracking, but is not control-energy-optimal without explicit model-informed design.
- Reward design and spatial/temporal discretization are critical for stability and sample efficiency. Poor design can result in numerically unstable or inefficient learning.
- The current suite is limited to time-invariant deterministic coefficients and full observability; RL policies are prone to oscillatory actuation.
- Scaling to higher Reynolds numbers or strongly nonlinear PDEs remains a challenge, with potential remedies including regularization and more sophisticated solver integration.
- Extensions under consideration include adaptive RL for time-varying/parametric PDEs, learning-based observer design for partial observability, PINN-based RL acceleration, algorithmic diversity (DDPG, TD3, model-based RL), and curriculum learning (Bhan et al., 2024).
Code, problem definitions, numerical engines, classical control implementations, and baseline RL agent scripts are available at https://github.com/lukebhan/PDEControlGym with accompanying documentation at https://pdecontrolgym.readthedocs.io.
References:
PDE Control Gym: A Benchmark for Data-Driven Boundary Control of Partial Differential Equations (Bhan et al., 2024)