Neural Solvers for Power Flow
- Neural solvers are ML models that map system controls to state variables using physics-based residual minimization to ensure AC feasibility.
- They integrate techniques such as physics-informed loss functions and test-time adaptation, achieving order-of-magnitude speedups and reductions in physical residuals.
- These models enable flexible, modular integration in optimization workflows like Predict-then-Optimise for practical power system applications.
Neural solvers are machine-learning surrogates for the power flow (PF) and optimal power flow (OPF) problems that underlie operational and planning analyses in electrical power systems. Built on deep architectures such as graph neural networks (GNNs) or feedforward NNs, these surrogates offer large improvements in computational speed over classical iterative solvers. Recent developments aim to combine this speed with physical consistency, flexibility across tasks, and AC feasibility by formulating and minimizing physics-based residuals rooted in power system laws (Stiasny et al., 14 Jan 2026, Dogoulis et al., 27 Nov 2025, Za'ter et al., 17 Oct 2025).
1. Mathematical Foundations of Neural Solvers
Neural solvers fundamentally address the mapping from system controls (e.g., power injections, setpoints) to state variables (e.g., bus voltages and angles) in a manner consistent with the underlying nonlinear and nonconvex AC power flow (PF) equations. The nodal power-balance equations for an $N$-bus network with admittance matrix $Y = G + jB$ are

$$P_i = V_i \sum_{j=1}^{N} V_j \left( G_{ij}\cos(\theta_i - \theta_j) + B_{ij}\sin(\theta_i - \theta_j) \right),$$

$$Q_i = V_i \sum_{j=1}^{N} V_j \left( G_{ij}\sin(\theta_i - \theta_j) - B_{ij}\cos(\theta_i - \theta_j) \right), \qquad i = 1, \dots, N,$$

where $V_i$ and $\theta_i$ are the voltage magnitudes and angles at bus $i$ (Dogoulis et al., 27 Nov 2025, Stiasny et al., 14 Jan 2026).
The Residual Power Flow (RPF) formulation systematically encodes these constraints via residual vectors built from both Kirchhoff's Current Law (KCL) and Kirchhoff's Voltage Law (KVL). For state vector $x$ and controls $u$, the KCL and KVL residuals are assembled as real vectors whose stacked form $r(x, u)$ determines feasibility: a state-control pair is AC-feasible if and only if $r(x, u) = 0$ (Stiasny et al., 14 Jan 2026).
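As a concrete illustration, a minimal NumPy sketch of assembling the stacked KCL residual vector $r(x, u)$ from the power-balance equations above (the function and variable names are illustrative, not taken from the cited papers):

```python
import numpy as np

def pf_residuals(V, theta, P_spec, Q_spec, G, B):
    """Stack the power-balance (KCL) residuals for an N-bus network.

    V, theta       : voltage magnitudes and angles (state x)
    P_spec, Q_spec : specified net injections (controls u)
    G, B           : real and imaginary parts of the admittance matrix Y
    """
    dtheta = theta[:, None] - theta[None, :]              # theta_i - theta_j
    P_calc = V * ((G * np.cos(dtheta) + B * np.sin(dtheta)) @ V)
    Q_calc = V * ((G * np.sin(dtheta) - B * np.cos(dtheta)) @ V)
    # The pair (x, u) is AC-feasible iff this stacked vector is zero
    return np.concatenate([P_calc - P_spec, Q_calc - Q_spec])
```

A candidate pair is then accepted as feasible when `np.linalg.norm(pf_residuals(...))` falls below a numerical tolerance.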
2. Residual-Based Learning and Surrogate Training
In neural solvers, the central training target is often a residual norm, typically

$$\rho(x, u) = \| r(x, u) \|_2^2,$$

where $r(x, u)$ stacks all active/reactive power (KCL) as well as KVL residuals; minimization of $\rho$ over $x$ under varying $u$ yields the "best effort" or slack-adjusted AC-feasible solution (Stiasny et al., 14 Jan 2026). This residual-centric perspective enables continuous feasibility quantification, bypassing discrete bus-type distinctions and supporting a more flexible, reusable modeling interface.
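A minimal sketch of this inner best-effort solve, minimizing $\|r(x, u)\|_2^2$ over the state with SciPy (the flat-start initialization and solver choice are assumptions; `pf_residuals` is the helper sketched above):

```python
import numpy as np
from scipy.optimize import least_squares

def solve_rpf(P_spec, Q_spec, G, B, N):
    """Best-effort inner solve: minimize ||r(x, u)||_2^2 over the state x."""
    x0 = np.concatenate([np.ones(N), np.zeros(N)])    # flat start (V=1, theta=0)

    def stacked_residual(x):
        V, theta = x[:N], x[N:]                       # unpack state vector
        return pf_residuals(V, theta, P_spec, Q_spec, G, B)

    sol = least_squares(stacked_residual, x0)         # Gauss-Newton-type solve
    return sol.x, 2.0 * sol.cost                      # state and ||r||^2
```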
Feature maps $\phi$: These may be linear or implemented as small feedforward NNs, and the main prediction then takes the form $\hat{x}(u) = \Theta\,\phi(u)$, with parameter vector $\theta = \mathrm{vec}(\Theta)$. Training minimizes mean squared error (MSE) between predicted and reference voltage (or state) vectors, and AD-based frameworks allow for the seamless outer-loop differentiation required for embedded optimization (Stiasny et al., 14 Jan 2026).
Training data are generated by sampling feasible (and, for robustness, infeasible) operating conditions on benchmark networks (e.g., the IEEE 9-bus system); ground-truth labels are obtained by solving the inner RPF minimization, potentially with slack-variable adjustment (Stiasny et al., 14 Jan 2026).
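A hedged PyTorch sketch of this supervised training loop (the network sizes, optimizer settings, and the `sample_training_pairs` data loader are illustrative assumptions):

```python
import torch
import torch.nn as nn

N, n_controls, n_epochs = 9, 18, 200      # illustrative sizes for a 9-bus case

# Small feedforward surrogate: controls u -> predicted state (V, theta)
surrogate = nn.Sequential(
    nn.Linear(n_controls, 64), nn.Tanh(),
    nn.Linear(64, 2 * N),
)
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for epoch in range(n_epochs):
    # sample_training_pairs (hypothetical) draws controls and RPF-solved labels
    u_batch, x_ref = sample_training_pairs()
    loss = nn.functional.mse_loss(surrogate(u_batch), x_ref)  # MSE on states
    opt.zero_grad()
    loss.backward()     # the same AD graph supports outer-loop differentiation
    opt.step()
```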
3. Physics-Informed Losses and Test-Time Adaptation
Modern neural solvers incorporate explicit physics-based loss terms to enforce AC power flow consistency and operational constraints (e.g., voltage and line-flow limits) during both training and inference. In the physics-informed Test-Time Training (PI-TTT) framework (Dogoulis et al., 27 Nov 2025), a surrogate model predicts bus voltages and angles, from which the power mismatches

$$\Delta P_i = P_i^{\text{spec}} - P_i(V, \theta), \qquad \Delta Q_i = Q_i^{\text{spec}} - Q_i(V, \theta)$$

are computed against the specified injections. At inference, PI-TTT refines the model parameters via a small number of gradient-based updates (5–20 steps with a small learning rate) to minimize a loss that aggregates these residuals and operational constraint violations.
The total test-time loss takes the form

$$\mathcal{L}_{\text{TTT}} = \|\Delta P\|_2^2 + \|\Delta Q\|_2^2 + \lambda_V\,\mathcal{P}_V + \lambda_F\,\mathcal{P}_F,$$

with ReLU-based penalties $\mathcal{P}_V$ and $\mathcal{P}_F$ (weighted by $\lambda_V$, $\lambda_F$) for voltage- and flow-limit violations. Empirically, PI-TTT achieves 1–2 orders of magnitude reductions in power flow residuals and violations at modest computational overhead (e.g., IEEE 14-bus: forward pass 3 ms, PI-TTT refinement +14 ms, vs. Newton–Raphson 19 ms) (Dogoulis et al., 27 Nov 2025).
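A minimal PyTorch sketch of such a test-time refinement loop (the step count, learning rate, penalty weights, and the `power_residuals`, `voltage_violation`, and `flow_violation` helpers are illustrative assumptions, not the paper's exact settings):

```python
import copy
import torch

def pi_ttt_refine(model, u, n_steps=10, lr=1e-3, lam_v=1.0, lam_f=1.0):
    """Refine surrogate parameters at inference by descending the physics loss."""
    model = copy.deepcopy(model)               # leave the trained weights intact
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_steps):
        V, theta = model(u)                    # predicted voltages and angles
        dP, dQ = power_residuals(V, theta, u)  # mismatch vs. specified injections
        # voltage_violation / flow_violation (hypothetical) return ReLU'd excesses
        loss = (dP.pow(2).sum() + dQ.pow(2).sum()
                + lam_v * voltage_violation(V).pow(2).sum()
                + lam_f * flow_violation(V, theta).pow(2).sum())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model(u)                            # refined prediction
```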
4. Predict-then-Optimise (PO) and Modular Integration
A central architectural advance is the integration of neural solvers within Predict-then-Optimise (PO) workflows (Stiasny et al., 14 Jan 2026). Here, fast surrogates provide voltage/state predictions $\hat{x}(u)$ for candidate controls $u$, which are then refined or further optimized within outer-loop problems such as AC-OPF:

$$\min_{u_d}\; c\big(u, \hat{x}(u)\big) \quad \text{subject to operational constraints},$$

where $u = (u_d, u_f)$ splits decision and non-decision controls. Gradients with respect to $u_d$ are obtained through AD, enabling fast outer-loop convergence.
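A minimal sketch of such an outer loop, differentiating the cost through a frozen surrogate with PyTorch AD (the control split, cost function, and optimizer settings are illustrative assumptions):

```python
import torch

def predict_then_optimise(surrogate, u_fixed, u_dec_init, cost_fn,
                          n_iters=100, lr=1e-2):
    """Outer-loop optimization of decision controls u_d through a surrogate."""
    u_dec = u_dec_init.clone().requires_grad_(True)    # decision controls u_d
    opt = torch.optim.Adam([u_dec], lr=lr)
    for _ in range(n_iters):
        u = torch.cat([u_dec, u_fixed])                # full control vector u
        x_hat = surrogate(u)                           # fast state prediction
        loss = cost_fn(u, x_hat)                       # e.g., generation cost
        opt.zero_grad()
        loss.backward()                                # d(cost)/d(u_d) via AD
        opt.step()
    return u_dec.detach()
```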
Key advantages are:
- Removal of rigid bus-type asymmetries.
- Flexibility to switch between tasks (e.g., AC-OPF, state estimation, quasi-steady-state PF) without retraining the inner neural block.
- Quantification and minimization of infeasibility via residual norms.
5. Residual Correction and Hybrid Architectures
Residual learning strategies extend neural solvers to hybrid settings, particularly the correction of linear baseline solutions (such as DC-OPF) to full nonlinear AC-feasible ones (Za'ter et al., 17 Oct 2025). The Residual Correction Model defines the AC-OPF solution as

$$x_{\text{AC}} = x_{\text{DC}} + \Delta_\theta(x_{\text{DC}}),$$

where $x_{\text{DC}}$ is constructed from the DC-OPF output, and $\Delta_\theta$ is a topology-aware GNN mapping DC variables to the necessary nonlinear corrections.
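A hedged PyTorch Geometric sketch of such a correction model (the layer widths, feature layout, and GCN choice are illustrative assumptions, not the paper's exact architecture):

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv

class ResidualCorrectionGNN(nn.Module):
    """Topology-aware correction: x_AC = x_DC + Delta_theta(x_DC)."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(n_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.head = nn.Linear(hidden, n_features)      # per-bus residual head

    def forward(self, x_dc, edge_index):
        # Message passing over the grid topology, then predict the correction
        h = torch.relu(self.conv1(x_dc, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        return x_dc + self.head(h)                     # DC baseline + correction
```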
The training loss comprises supervised fit, physics-informed residual penalties (enforcing AC power-flow feasibility and operational limits), cost-optimality objectives, and regularization of residual heads. Experiments with IEEE 57-, 118-, and 2000-bus systems show MSE reductions (up to 40%), 3× lower feasibility errors, and up to 13× speedup (e.g., 2000-bus: neural inference 7 s, AC-IPOPT 93 s). Robustness to N−1 contingencies is maintained, with small error increases in zero-shot tests and full accuracy upon fine-tuning (Za'ter et al., 17 Oct 2025).
6. Quantitative Evaluation and Performance Insights
Evaluation across a range of tasks and test systems demonstrates the strengths and boundaries of current neural solvers:
| Benchmark | Approach | Error (V, P) [p.u.] | Feasibility Error | Speedup vs. AC-IPOPT |
|---|---|---|---|---|
| IEEE 14-bus | PowerFlowNet | MSE 0.924 | — | — |
| IEEE 14-bus | PowerFlowNet + PI-TTT | MSE 0.047 | — | ∼1.3× |
| IEEE 57-bus | Residual-Corr. GNN | MSE ↓40% | 3× lower | — |
| IEEE 118-bus | Residual-Corr. GNN | MSE 3.1e-4 | 3× lower | 11× |
| PEGASE 1354 | PowerFlowNet + PI-TTT | RMSE (P) 0.72 | max abs. violation ↓ | — |
| 2000-bus | Residual-Corr. GNN | ≈50% of best baseline | — | 13× |
Physical residuals and operational constraint violations are consistently reduced by up to two orders of magnitude when employing residual-based test-time refinement or correction architectures (Dogoulis et al., 27 Nov 2025, Stiasny et al., 14 Jan 2026, Za'ter et al., 17 Oct 2025).
7. Scalability, Flexibility, and Limitations
Recent neural solver frameworks scale efficiently to networks of thousands of buses due to GNN architectures and message passing based on local topology (Za'ter et al., 17 Oct 2025). Training is achieved with moderate data requirements (e.g., 2,000 points for a 9-bus case) and standard optimizers (L-BFGS, Adam). Inference runtimes are in the 30 μs–14 ms range, offering near real-time feasibility checks and optimization.
However, on large or highly nonlinear networks, a fixed number of gradient refinement steps may be insufficient to drive residuals to zero; richer update parameterizations or additional iterations may be required. Success also presupposes a high-quality initial surrogate: poor initialization can slow or impede convergence (Dogoulis et al., 27 Nov 2025). The reusability and generalization features of RPF and related formulations mitigate task inflexibility, but optimality gaps may persist in domains whose dynamics or topology lie far outside the training regime (Stiasny et al., 14 Jan 2026).
Neural solvers represent a foundational data-driven paradigm for power-system analysis, integrating the speed of learned surrogates with physical reliability and reusability across operational and optimization tasks. The interplay of residual formulation, test-time adaptation, and modular optimization enables versatility beyond classical ML-based proxies (Stiasny et al., 14 Jan 2026, Dogoulis et al., 27 Nov 2025, Za'ter et al., 17 Oct 2025).