QMaxCal Framework: Path-Space KL Regularization
- QMaxCal is a quantum control framework that regularizes open-system dynamics using path-space KL divergence, leveraging Girsanov’s theorem.
- It introduces Wiener-KL and drift-variance regularizers to penalize controls that enhance environmental decoherence, enabling differentiable optimization.
- Empirical tests on quantum benchmarks demonstrate up to 50% infidelity reduction and improved robustness under noise-model mismatches.
QMaxCal (“Quantum Maximum Caliber”) is a path-space Kullback–Leibler (KL) regularization framework for open quantum system control problems under decoherence. Its principled regularizers leverage Girsanov’s theorem to penalize controls that lead to trajectories with enhanced observable effects of the environment, thus driving the system toward states or subspaces with minimal decoherence. QMaxCal introduces two KL-based regularization terms—the Wiener-KL and drift-variance regularizers—complementing standard control fluence penalties, and produces closed-form, differentiable estimators for use in gradient-based and reinforcement learning optimization of time-dependent controls (Moody et al., 18 Jun 2026).
1. Open Quantum Control in Path Space
When a quantum system interacts with its environment, continuous monitoring of the decoherence channels induces stochastic pure-state trajectories governed by the stochastic Schrödinger equation (SSE). Each monitored decoherence channel $k$ yields a classical trajectory
$$dy_k(t) = \mu_k(t)\,dt + dW_k(t),$$
where the drift $\mu_k(t)$ encodes the effect of the decoherence operator $L_k$ and $dW_k(t)$ is the Wiener increment. Different control protocols produce distinct drifts but are subject to the same environmental noise realizations.
Girsanov’s theorem provides a mechanism to relate the path probability distributions generated by different control protocols acting on the same open quantum system, by expressing a closed-form Radon–Nikodym derivative and KL divergence between their associated ensembles of measurement records. The QMaxCal framework exploits this result to regularize open-system control strategies, explicitly penalizing the projected impact of the system’s evolution onto the decoherence channels.
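As a rough illustration of a record of the form $dy = \mu(t)\,dt + dW$, the sketch below integrates it with the Euler-Maruyama scheme. The function name `simulate_record` and the two drift functions are hypothetical; in an actual SSE simulation the drift would depend on the evolving quantum state rather than being a fixed function of time. Reusing the seed mimics two protocols that see the same noise realization.

```python
import math
import random

def simulate_record(drift, T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama integration of dy = drift(t) dt + dW over [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    y = 0.0
    path = [y]
    for i in range(n_steps):
        t = i * dt
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment with variance dt
        y += drift(t) * dt + dW
        path.append(y)
    return path

# Two hypothetical protocols driven by the same noise realization (same seed).
path_a = simulate_record(lambda t: math.cos(2 * math.pi * t), seed=42)
path_b = simulate_record(lambda t: 0.0, seed=42)
```

Because both paths share the noise, their difference isolates the accumulated drift, which is the quantity the path-space KL divergence is built from.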
2. Girsanov-Based Path-Space KL Divergence
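The classical version of Girsanov’s theorem addresses diffusions of the form
$$dX_t = b(X_t, t)\,dt + dW_t$$
within a shared Wiener-noise probability space. The key result is that the KL divergence between the path measures $P_b$ and $P_{b'}$ generated by two drifts $b$ and $b'$ is
$$D_{\mathrm{KL}}(P_b \,\|\, P_{b'}) = \frac{1}{2}\,\mathbb{E}_{P_b}\!\left[\int_0^T \big(b(X_t,t) - b'(X_t,t)\big)^2\,dt\right],$$
where $T$ is the time horizon and the expectation is taken over paths generated under $b$. In the quantum-trajectory case, measurement records for each decoherence channel inherit this structure: under controls $u$ and $u'$, the pathwise records are diffusions with drifts $\mu_k^{u}(t)$ and $\mu_k^{u'}(t)$. The relative entropy reads
$$D_{\mathrm{KL}}(P^{u} \,\|\, P^{u'}) = \frac{1}{2}\sum_k \mathbb{E}_{P^{u}}\!\left[\int_0^T \big(\mu_k^{u}(t) - \mu_k^{u'}(t)\big)^2\,dt\right].$$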
Selecting appropriate reference measures produces motivated regularizers for quantum control objectives.
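The Girsanov KL formula can be checked numerically in the simplest classical setting. The sketch below assumes two deterministic, time-dependent drifts `mu` and `nu` with unit diffusion (an illustrative choice, not the quantum case), samples paths under `mu`, and averages the log Radon-Nikodym derivative. The result should approach one half of the time-integrated squared drift difference.

```python
import math
import random

def kl_monte_carlo(mu, nu, T=1.0, n_steps=100, n_paths=2000, seed=1):
    """Average log dP_mu/dP_nu over paths sampled under drift mu."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        log_ratio = 0.0
        for i in range(n_steps):
            t = i * dt
            dW = rng.gauss(0.0, math.sqrt(dt))
            dx = mu(t) * dt + dW  # path increment under drift mu
            # Girsanov increment: (mu - nu) dx - 0.5 (mu^2 - nu^2) dt
            log_ratio += (mu(t) - nu(t)) * dx - 0.5 * (mu(t) ** 2 - nu(t) ** 2) * dt
        total += log_ratio
    return total / n_paths

def kl_closed_form(mu, nu, T=1.0, n_steps=100):
    """0.5 * integral of (mu - nu)^2 dt, evaluated as a left Riemann sum."""
    dt = T / n_steps
    return 0.5 * sum((mu(i * dt) - nu(i * dt)) ** 2 for i in range(n_steps)) * dt

mu = lambda t: 1.0 + t   # hypothetical drift of the controlled process
nu = lambda t: 0.5       # hypothetical drift of the reference process
estimate = kl_monte_carlo(mu, nu)
exact = kl_closed_form(mu, nu)
```

Setting `nu` to zero corresponds to a Brownian-motion reference, and setting it to a constant corresponds to a constant-drift reference.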
3. Regularizers: Wiener-KL and Drift-Variance
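QMaxCal defines two primary path-space regularizers for a control protocol parameterized by $\theta$, written here in terms of the drift $\mu_k(t)$ of each decoherence channel $k$ and a reference drift $\nu_k$:
| Regularizer | Reference Process | Penalty Formulation |
|---|---|---|
| Wiener-KL ($\mathcal{R}_{\mathrm{W}}$) | Brownian motion ($\nu_k = 0$) | $\frac{1}{2}\sum_k \mathbb{E}\left[\int_0^T \mu_k(t)^2\,dt\right]$ |
| Drift-variance ($\mathcal{R}_{\mathrm{DV}}$) | Constant-drift process ($\nu_k = c_k$) | $\frac{1}{2}\sum_k \mathbb{E}\left[\int_0^T \big(\mu_k(t) - c_k^\star\big)^2\,dt\right]$ |
- Wiener-KL ($\mathcal{R}_{\mathrm{W}}$): The reference is pure Brownian motion (zero drift). This regularizer penalizes the mean-square drift, incentivizing trajectories to approach the joint kernel $\bigcap_k \ker L_k$ of the decoherence operators, the “dark” or decoherence-free states under all channels.
- Drift-variance ($\mathcal{R}_{\mathrm{DV}}$): The reference is the best-fit constant-drift process, minimizing over $c_k$. For each channel, the optimal $c_k^\star = \frac{1}{T}\,\mathbb{E}\left[\int_0^T \mu_k(t)\,dt\right]$ is the mean drift. The penalty quantifies the temporal variance of each drift about its mean, vanishing exactly for decoherence-free subspaces (DFS) with constant $\mu_k(t)$.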
These KL-derived penalties differ qualitatively from standard fluence or pulse-smoothness regularization by acting directly on time-resolved observables of the system-environment interaction rather than on control differentiability or bandwidth.
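A minimal sketch of how the two penalties might be estimated from sampled drifts follows. It assumes the drifts are stored as a nested list `drifts[n][k][i]` (trajectory, channel, time step) on a uniform grid with spacing `dt`; the function name and data layout are illustrative, not taken from the source. The Wiener-KL estimate is half the time-integrated squared drift, and the drift-variance estimate subtracts the best-fit constant per channel first.

```python
def path_space_penalties(drifts, dt):
    """Return (wiener_kl, drift_variance) estimated from sampled drifts.

    drifts[n][k][i] is the drift of channel k at time step i on trajectory n.
    """
    n_traj = len(drifts)
    n_chan = len(drifts[0])
    n_steps = len(drifts[0][0])
    T = n_steps * dt

    wiener_kl = 0.0
    drift_var = 0.0
    for k in range(n_chan):
        # Best-fit constant drift: mean over time and trajectories.
        c_k = sum(sum(drifts[n][k]) for n in range(n_traj)) * dt / (n_traj * T)
        for n in range(n_traj):
            wiener_kl += 0.5 * sum(m * m for m in drifts[n][k]) * dt
            drift_var += 0.5 * sum((m - c_k) ** 2 for m in drifts[n][k]) * dt
    return wiener_kl / n_traj, drift_var / n_traj

# One trajectory, one channel, constant drift 0.4 over 100 steps of size 0.01.
constant = [[[0.4] * 100]]
r_w, r_dv = path_space_penalties(constant, dt=0.01)
```

For a constant drift the drift-variance estimate vanishes while the Wiener-KL estimate does not, which mirrors the distinction between a general decoherence-free subspace and the joint kernel.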
4. Augmented Control Objective and Derivatives
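QMaxCal’s objective for a state-transfer task from $|\psi_0\rangle$ to $|\psi_{\mathrm{tgt}}\rangle$ at fixed $T$ is
$$J(\theta) = 1 - \mathbb{E}\big[F(\theta)\big] + \lambda_{\mathrm{W}}\,\mathcal{R}_{\mathrm{W}}(\theta) + \lambda_{\mathrm{DV}}\,\mathcal{R}_{\mathrm{DV}}(\theta) + \lambda_{\Phi}\,\Phi(\theta),$$
where $F(\theta) = |\langle\psi_{\mathrm{tgt}}|\psi_\theta(T)\rangle|^2$ is the final-state fidelity, $\mathcal{R}_{\mathrm{W}}$ and $\mathcal{R}_{\mathrm{DV}}$ are the Wiener-KL and drift-variance regularizers with nonnegative weights $\lambda_{\mathrm{W}}$ and $\lambda_{\mathrm{DV}}$, $\Phi(\theta)$ is an optional fluence constraint, and expectations are over sampled SSE trajectories.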
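The gradients of the objective are computed by backpropagation through the sampled trajectories. For the Wiener-KL regularizer $\mathcal{R}_{\mathrm{W}}$ and the drift-variance regularizer $\mathcal{R}_{\mathrm{DV}}$, with $\mu_k(t)$ the drift of channel $k$ and $c_k^\star$ its mean over time and trajectories:
- $\nabla_\theta \mathcal{R}_{\mathrm{W}} = \sum_k \mathbb{E}\left[\int_0^T \mu_k(t)\,\nabla_\theta \mu_k(t)\,dt\right]$
- $\nabla_\theta \mathcal{R}_{\mathrm{DV}} = \sum_k \mathbb{E}\left[\int_0^T \big(\mu_k(t) - c_k^\star\big)\,\nabla_\theta \mu_k(t)\,dt\right]$
The derivatives $\nabla_\theta \mu_k(t)$ are computed via automatic differentiation through the SSE numerical integrator, which also yields gradients for the fluence term.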
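The chain-rule structure of the Wiener-KL gradient (drift times drift derivative, integrated over time) can be sanity-checked on a toy model. The sketch below assumes a hypothetical deterministic drift with two parameters and compares the analytic gradient against central finite differences. Finite differences stand in for automatic differentiation here, and no SSE integrator is involved.

```python
import math

N_STEPS = 1000
DT = 1.0 / N_STEPS

def drift(theta, t):
    # Hypothetical drift model: constant offset plus one Fourier mode.
    return theta[0] + theta[1] * math.sin(2 * math.pi * t)

def drift_grad(theta, t):
    # Partial derivatives of the toy drift with respect to theta[0], theta[1].
    return [1.0, math.sin(2 * math.pi * t)]

def wiener_kl(theta):
    # 0.5 * integral of drift^2 dt as a left Riemann sum.
    return 0.5 * sum(drift(theta, i * DT) ** 2 for i in range(N_STEPS)) * DT

def wiener_kl_grad(theta):
    # Integral of drift * d(drift)/d(theta) dt, component by component.
    grad = [0.0, 0.0]
    for i in range(N_STEPS):
        t = i * DT
        mu, dmu = drift(theta, t), drift_grad(theta, t)
        for j in range(2):
            grad[j] += mu * dmu[j] * DT
    return grad

def finite_difference(f, theta, eps=1e-6):
    grad = []
    for j in range(len(theta)):
        up = list(theta); up[j] += eps
        dn = list(theta); dn[j] -= eps
        grad.append((f(up) - f(dn)) / (2 * eps))
    return grad

theta = [0.3, 0.8]
analytic = wiener_kl_grad(theta)
numeric = finite_difference(wiener_kl, theta)
```

Agreement between `analytic` and `numeric` is the kind of check one would also run against an autodiff gradient before trusting it inside an optimizer.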
5. Optimization Protocol
The gradient-based QMaxCal algorithm proceeds as follows:
- Parameter initialization: E.g., Fourier coefficients for each control channel.
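- Trajectory sampling: Integrate the SSE for $N$ trajectories $\{|\psi^{(n)}(t)\rangle\}_{n=1}^{N}$ under the control $u_\theta(t)$.
- Observable accumulation: For each trajectory, record final-state fidelity and the channel drifts $\mu_k^{(n)}(t)$.
- Estimator calculation: Compute sample means for the objective, the Wiener-KL penalty $\mathcal{R}_{\mathrm{W}}$, the drift-variance penalty $\mathcal{R}_{\mathrm{DV}}$, and fluence.
- Objective and gradient computation: Use automatic differentiation to obtain the regularized objective $J(\theta)$ and its gradient $\nabla_\theta J(\theta)$.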
- Parameter update: Apply a gradient descent step (e.g., Adam).
A reinforcement learning adaptation uses, e.g., PPO with the negative of the regularized objective as the reward, so that maximizing reward minimizes infidelity plus the path-space penalties.
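The evaluate, differentiate, update cycle of the gradient-based protocol can be sketched as below. This is a structural illustration under strong simplifications: `toy_objective` is a hypothetical stand-in for the sampled SSE objective, gradients come from central finite differences rather than automatic differentiation, and the update is plain gradient descent rather than Adam.

```python
def gradient_descent(objective, theta, lr=0.1, n_iters=200, eps=1e-6):
    """Plain gradient descent with central finite-difference gradients."""
    theta = list(theta)
    for _ in range(n_iters):
        grad = []
        for j in range(len(theta)):
            up = list(theta); up[j] += eps
            dn = list(theta); dn[j] -= eps
            grad.append((objective(up) - objective(dn)) / (2 * eps))
        # Parameter update step.
        theta = [p - lr * g for p, g in zip(theta, grad)]
    return theta

def toy_objective(theta, weight=0.1):
    # Stand-in "infidelity" with its minimum at (1.0, -0.5).
    infidelity = (theta[0] - 1.0) ** 2 + (theta[1] + 0.5) ** 2
    # Stand-in penalty that grows with parameter magnitude.
    penalty = 0.5 * (theta[0] ** 2 + theta[1] ** 2)
    return infidelity + weight * penalty

theta_opt = gradient_descent(toy_objective, [0.0, 0.0])
```

The penalty pulls the optimum slightly away from the unregularized minimum, which is the generic trade-off a regularization weight controls.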
6. Empirical Performance Across Quantum Benchmarks
QMaxCal was evaluated on five representative open quantum system benchmarks, with consistent comparison to unregularized gradient-based trajectory optimizers and RL-based PPO baselines:
- Single-Qubit Amplitude Damping: The Wiener-KL regularizer contracted the SSE-trajectory population variance and improved final-state fidelity, with a corresponding reduction in infidelity. Drift-variance was less effective here.
- STIRAP (Λ system): Regularization reduced both the peak population and the time-integrated exposure of the lossy state while maintaining fidelity. The PPO baseline degraded in fidelity.
- Diamond Four-Level System: The baseline fidelity, limited by leakage, was improved by QMaxCal regularization. Under noise-model mismatch, the regularized controls maintained a fidelity advantage over the baseline.
- Four-Qubit Chain: With asymmetric dephasing, regularization raised final-state fidelity above the baseline, reducing infidelity.
- IBM Kingston Six-Qubit Chain: Drift-variance regularization improved on the baseline fidelity and reduced infidelity, with Wiener-KL slightly trailing; a PPO baseline was also evaluated.
These results demonstrate that QMaxCal regularization steers trajectories into decoherence-avoiding subspaces and enhances both final-state fidelity and robustness to noise-model mismatch. Gains of up to 50% infidelity reduction, together with percentage-point fidelity improvements under noise-model mismatch, are reported.
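The two ways of quoting a gain, percentage points of fidelity and relative infidelity reduction, are related by simple arithmetic. The sketch below assumes that "infidelity reduction" denotes the relative decrease in $1 - F$; the function names and the input values are illustrative only and are not results from the source.

```python
def percentage_point_gain(f_base, f_new):
    """Absolute fidelity gain expressed in percentage points."""
    return 100.0 * (f_new - f_base)

def relative_infidelity_reduction(f_base, f_new):
    """Fractional drop in infidelity (1 - F) when fidelity moves from f_base to f_new."""
    return ((1.0 - f_base) - (1.0 - f_new)) / (1.0 - f_base)

# Illustrative numbers only, not reported results.
pp = percentage_point_gain(0.80, 0.86)
rel = relative_infidelity_reduction(0.80, 0.86)
```

The same percentage-point gain corresponds to a larger relative infidelity reduction when the baseline fidelity is already high, which is why both figures are usually quoted together.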
7. Distinguishing Features and Theoretical Context
QMaxCal’s principal innovation is the construction of differentiable path-space KL regularizers for open quantum dynamics, grounding the control penalty in observable statistics of the decoherence channels. Unlike conventional penalties on control fluence or smoothness, which limit total pulse energy or bandwidth, QMaxCal’s terms directly penalize the cumulative environmental “visibility” of noise-induced drift, providing physical interpretability and task relevance. The Wiener-KL regularizer drives evolution into the joint kernel of the Lindblad terms, while the drift-variance regularizer identifies any decoherence-free subspace and remains effective even when no joint kernel exists.
A plausible implication is that QMaxCal can substantially improve outcome fidelity in realistic quantum hardware scenarios, especially where noise-model mismatch or complex open-system structure undermines reward shaping and prior-based regularization.
For further derivations and technical details, see (Moody et al., 18 Jun 2026) (Appendices B–F).