Error-Conditioned Neural Solvers

Published 25 Jun 2026 in cs.LG, cs.AI, cs.CV, and math.NA | (2606.27354v1)

Abstract: Neural surrogate models offer fast approximate mappings from PDE parameters to solutions, but they typically treat solving as a purely statistical task: once trained, they struggle to correct their own constraint violations and extrapolate beyond the training distribution. Recent hybrid methods promote physical correctness by targeting the PDE residual via gradient descent or Gauss--Newton steps, but inherit the compute cost and instability of the underlying classical optimizers. We show, theoretically and empirically, that numerically minimizing the PDE residual can be an unreliable proxy for reconstruction accuracy in ill-conditioned systems, explaining why these methods often do not make accurate predictions despite achieving low residuals. We propose error-conditioned Neural Solvers (ENS), built on a different principle: rather than an optimization target, the PDE residual field is passed as a direct input to the network at each iteration, enabling it to read the spatial structure of its own errors and learn an update policy to iteratively correct its predictions. Across four PDE families, ENS attains the highest prediction accuracy in the large majority of settings, with gains reaching $10\times$ on turbulent Kolmogorov flow, while avoiding the expensive compute cost of hybrid methods. ENS's learned correction policy generalizes under distribution shift, including zero-shot parameter changes and cross-equation transfer, where its relative advantage is largest in the ill-conditioned regimes where residual minimization is least reliable. Project website: https://neuralsolver.github.io/.


Authors (7)

Summary

The paper introduces a novel recurrent framework that inputs the PDE residual directly, enabling dynamic error correction and robust generalization.
It demonstrates up to 10× accuracy improvements on ill-conditioned PDEs compared to hybrid test-time optimization methods while remaining computationally efficient.
The method successfully integrates neural operators with residual conditioning, bypassing traditional optimization pitfalls in physics-constrained deep learning.

Error-Conditioned Neural Solvers: A Principled Approach to Physics-Constrained Deep PDE Solving

Introduction

Neural operator surrogates have become central to accelerating the solution of parametric partial differential equations (PDEs), but their feed-forward maps cannot detect or correct constraint violations at inference time, which limits both accuracy and generalization beyond the training distribution. Recent "hybrid" methods that use the PDE residual as a test-time optimization target (e.g., PINO, DiffusionPDE, PCFM) invoke classical numerical routines and inherit their compute cost and instability. "Error-Conditioned Neural Solvers" (ENS) (2606.27354) introduces a recurrent deep learning framework in which the PDE residual field is provided at each iteration as a direct input to the network. This lets the network learn a dynamically adaptive correction policy that addresses constraint satisfaction and generalization, particularly for ill-conditioned systems.

Methodological Framework

Unified View of Residual-Corrected Neural Solving

ENS is founded on the insight that minimizing the PDE residual is often a poor surrogate for solution accuracy in the presence of ill-conditioning, as shown both theoretically and empirically. The PDE residual is instead incorporated as a spatial input, coupled with the current prediction, allowing the corrector to learn complex nonlinear update policies. This stands in contrast to baseline hybrid approaches that use residuals as optimization targets and inherit the computational and stability pathologies of gradient or proximal-based updates.
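To make "residual as a spatial input" concrete, the following is a minimal sketch, assuming a 2D Poisson problem $-\Delta u = f$ on a uniform grid with a five-point stencil. The function names `poisson_residual` and `corrector_input`, the grid size, and the channel layout are illustrative assumptions and are not taken from the paper.

```python
import math

def poisson_residual(u, f, h):
    """Residual field r = f + laplacian(u) for -laplacian(u) = f on interior nodes.

    Boundary entries stay zero (Dirichlet values are assumed to be imposed exactly).
    """
    n = len(u)
    r = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                   - 4.0 * u[i][j]) / (h * h)
            r[i][j] = f[i][j] + lap
    return r

def corrector_input(u, r, f):
    """Stack prediction, residual, and forcing as three channels of shape (n, n)."""
    return [u, r, f]

def max_abs(field):
    return max(abs(v) for row in field for v in row)

n = 33
h = 1.0 / (n - 1)
xs = [i * h for i in range(n)]
# Manufactured solution u* = sin(pi x) sin(pi y), so that f = 2 pi^2 u*.
u_true = [[math.sin(math.pi * x) * math.sin(math.pi * y) for y in xs] for x in xs]
f = [[2.0 * math.pi ** 2 * v for v in row] for row in u_true]
# An imperfect prediction: the true field scaled by 0.9.
u_hat = [[0.9 * v for v in row] for row in u_true]

r_true = poisson_residual(u_true, f, h)  # only discretization error remains
r_hat = poisson_residual(u_hat, f, h)    # large, spatially structured residual
channels = corrector_input(u_hat, r_hat, f)
print(max_abs(r_true), max_abs(r_hat), len(channels))
```

The residual of the imperfect prediction is large exactly where the prediction is wrong, which is the spatial structure a corrector network could read from its input channels.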

Mathematically, ENS operates with the following iterative scheme:

$\hat{u}^{(k+1)} = \hat{u}^{(k)} + \beta\, C(\hat{u}^{(k)}, r^{(k)}; f, g)$

where $C$ is a learned corrector, $\hat{u}^{(k)}$ is the current prediction, and $r^{(k)}$ is the residual field computed from the PDE operator. The only supervision signal is reconstruction error to ground truth, with residuals never appearing in the loss.
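The data flow of this update can be sketched in a few lines. The example below assumes a 1D Poisson problem $-u'' = f$ with zero Dirichlet boundaries, and it replaces the learned corrector $C$ with a hand-written Jacobi-style step (`standin_corrector`, a hypothetical placeholder). In ENS the corrector is a trained network that also receives $\hat{u}^{(k)}$, $f$, and $g$; the stand-in only shows how the residual is recomputed and fed back at every iteration.

```python
def residual(u, f, h):
    """r = f + u'' (finite differences) for -u'' = f with u(0) = u(1) = 0."""
    n = len(u)
    r = [0.0] * n
    for i in range(1, n - 1):
        r[i] = f[i] + (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
    return r

def standin_corrector(u, r, h):
    """Hypothetical stand-in for C: a Jacobi-style step that uses only r.

    A learned corrector would be a neural network taking (u, r, f, g).
    """
    return [0.5 * h * h * ri for ri in r]

def iterate(u0, f, h, beta=1.0, steps=200):
    """Apply u <- u + beta * C(u, r) for a fixed number of steps."""
    u = list(u0)
    history = []
    for _ in range(steps):
        r = residual(u, f, h)
        history.append(max(abs(x) for x in r))
        c = standin_corrector(u, r, h)
        u = [ui + beta * ci for ui, ci in zip(u, c)]
    return u, history

n = 9
h = 1.0 / (n - 1)
f = [1.0] * n          # constant forcing; exact solution is x (1 - x) / 2
u0 = [0.0] * n         # crude initial prediction
u_final, history = iterate(u0, f, h)
print(history[0], history[-1], u_final[n // 2])
```

Because the stand-in is itself a classical iteration, it converges slowly and only illustrates the loop structure; the point of ENS is to replace it with a learned update policy.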

Theoretical Analysis: Residual-Reconstruction Gap

A central proposition (Prop. 1) shows that for a PDE with solution $u^*$, driving the residual norm $\|r(u)\|$ toward zero does not guarantee a small reconstruction error, especially when the minimum singular value of the Jacobian $J_r(u^*)$ is small. This explains empirically observed failure modes in which hybrid methods achieve low residuals but poor predictive accuracy, for example on high-frequency Helmholtz or low-viscosity Navier-Stokes equations.
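A small linear example illustrates the gap; it is not the paper's proposition or proof. Assume a diagonal system $A u = b$ with singular values $1$ and $10^{-4}$, so the residual is $r(u) = A u - b$ and its Jacobian is $A$. To first order the error obeys $\|u - u^*\| \le \|r(u)\| / \sigma_{\min}$, so a tiny residual still permits a large error along the weakly constrained direction. All values below are made up for illustration.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def residual(sigma, u, b):
    """Residual of the diagonal system diag(sigma) u = b."""
    return [s * x - bi for s, x, bi in zip(sigma, u, b)]

sigma = [1.0, 1e-4]                      # singular values of the diagonal operator
u_star = [1.0, 1.0]                      # true solution
b = [s * x for s, x in zip(sigma, u_star)]

u_a = [1.01, 1.0]   # small error along the well-conditioned direction
u_b = [1.0, 2.0]    # large error along the ill-conditioned direction

res_a = norm(residual(sigma, u_a, b))
res_b = norm(residual(sigma, u_b, b))
err_a = norm([x - y for x, y in zip(u_a, u_star)])
err_b = norm([x - y for x, y in zip(u_b, u_star)])

print(f"candidate a: residual={res_a:.1e}, error={err_a:.1e}")
print(f"candidate b: residual={res_b:.1e}, error={err_b:.1e}")
```

Candidate b has the smaller residual and the larger error, so ranking candidates by residual alone would pick the worse reconstruction.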

Architectural Instantiations

Predictor Network: Produces initial approximation (modified FNO/CNN for static PDEs; transformer-based VideoPDE for turbulent flows).
Corrector Network: Receives the current solution and its PDE residual, outputting an update; trained jointly via backpropagation based solely on reconstruction error.
Iteration: Corrector is applied recurrently, typically with a fixed number of steps during training and either early stopping or predetermined iteration count at inference.

This approach allows the network to be trained solely on the observable error with respect to the ground-truth solution, thereby avoiding the objective-level failure of "residual-centric" optimization frameworks.

Experimental Results

Benchmarks and Regimes

ENS is evaluated across four PDE families: Helmholtz (linear and nonlinear), Poisson, Darcy, and Navier-Stokes (including turbulent Kolmogorov flow). The evaluation spans in-distribution prediction, parameter extrapolation (e.g., viscosity/wavenumber shifts), cross-equation transfer, and super-resolution.

Quantitative Results

Prediction Accuracy: ENS attains the lowest relative $L_2$ errors across nearly all regimes. For instance, on turbulent Kolmogorov flow, an extremely ill-conditioned regime, ENS achieves up to $10\times$ lower error than hybrid and operator baselines. On the Helmholtz equation, ENS is also the only method that consistently achieves both low residual and low prediction error.
Hybrid Method Failure: PINO and PCFM, while reaching low residuals, yield high reconstruction errors in ill-conditioned regimes, confirming the theoretically predicted residual-reconstruction gap.
Efficiency: ENS inference takes roughly 0.1 to 0.4 s per sample, faster than test-time optimization hybrids (e.g., PINO-TTOP) and competitive with data-only neural operators, while PCFM and DiffusionPDE are orders of magnitude more expensive.
Generalization: ENS generalizes robustly to distribution shifts (e.g., zero-shot parameter changes, cross-equation transfer), whereas comparator networks either diverge or plateau at high error.

Method	Low Residual	Low $L_2$ Error	Extrapolation	Compute
FNO	✗	✗	✗	Fast
PINO-TTOP	✓	✗	✗	Slow
PCFM	✓	✗	✗	Slow
ENS	✓	✓	✓	Fast
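For reference, the relative $L_2$ error used as the accuracy metric is commonly defined as $\|\hat{u} - u^*\|_2 / \|u^*\|_2$. The sketch below assumes this standard definition on flattened fields; the paper's exact normalization (for example, averaging over samples or time steps) is not given here, and the function name `relative_l2` is illustrative.

```python
import math

def relative_l2(pred, true):
    """Relative L2 error ||pred - true||_2 / ||true||_2 on flattened fields."""
    if len(pred) != len(true):
        raise ValueError("pred and true must have the same length")
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t * t for t in true))
    if den == 0.0:
        raise ValueError("reference field has zero norm")
    return num / den

u_true = [3.0, 4.0]
print(relative_l2([3.0, 4.0], u_true))   # perfect prediction
print(relative_l2([0.0, 0.0], u_true))   # predicting zero everywhere
print(relative_l2([3.3, 4.4], u_true))   # uniform 10% overshoot
```

A 2D field would be flattened row by row before calling the function.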

Ablation and Analysis

Residual Conditioning: Removing the residual input (zeroed residual field) nullifies the improvement in both reconstruction and physical accuracy, confirming the gain is from explicit error reading, not additional recursion depth.
Initialization Robustness: ENS converges to the same residual floor irrespective of initialization (diverse starting points tested, spanning orders of magnitude in initial residual), in sharp contrast to the local convergence basin of Gauss-Newton/Newton-based correctors.
Backbone Expressivity: Effective correction requires sufficient backbone expressivity to represent fine features in the residual field. CNN-augmented architectures and transformers support convergence, while pure FNOs/UNets suffer.
Training Objective: Supplementing the objective with a residual loss term does not improve reconstruction further, consistent with the theoretical analysis.
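The local convergence basin of Newton-type correctors mentioned under Initialization Robustness can be seen in a standard textbook example that is unrelated to the paper's experiments: Newton's method on $\arctan(x) = 0$ converges from $x_0 = 1$ but diverges from $x_0 = 2$. The function name and the divergence cutoff below are illustrative choices.

```python
import math

def newton_arctan(x0, steps=20, blowup=1e6):
    """Newton iteration for arctan(x) = 0, whose only root is x = 0.

    Update: x <- x - f(x) / f'(x) with f'(x) = 1 / (1 + x^2).
    Stops early once |x| exceeds `blowup`, which signals divergence.
    """
    x = x0
    for _ in range(steps):
        x = x - math.atan(x) * (1.0 + x * x)
        if abs(x) > blowup:
            break
    return x

near = newton_arctan(1.0)   # starts inside the convergence basin
far = newton_arctan(2.0)    # starts outside it; iterates grow without bound
print(near, far)
```

The same iteration succeeds or fails depending only on the starting point, which is the behavior the ablation contrasts with ENS reaching the same residual floor from diverse initializations.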

Implications and Future Work

Practical Significance

ENS delivers high-fidelity, physically accurate, and fast neural PDE solvers robust to distribution shift and ill-conditioned operators—a crucial advance for large-scale scientific computing, surrogate modeling, and real-time simulation where classical solvers or hybrid iterative procedures are infeasible. Its independence from explicit test-time optimization offers fixed, predictable inference cost scalable to large domains and families of problems.

Theoretical Perspective

The persistent residual-reconstruction gap identified here reframes the foundation of physics-constrained ML for PDEs, emphasizing that literal constraint satisfaction during optimization may be inadequate—informing both the design of future neural solvers and evaluation metrics. The initialization-robust, error-adaptive policy learned by ENS is an instantiation of a more general paradigm for iterative correction in scientific ML.

Limitations and Extensions

The current instantiation is tested on 2D PDEs with known equations. Promising directions include extending ENS to 3D, handling measurement noise or unknown/approximate governing equations, and using ENS as a plug-in for generative modeling frameworks, including the proposed generative Diffusion ENS variant. ENS's robustness to residual misspecification (e.g., due to discretization or partial knowledge) suggests favorable applicability in inverse problems and real-world predictive settings.

Conclusion

Error-Conditioned Neural Solvers are a principled, empirically validated, and computationally efficient family of neural PDE solvers that use direct residual field conditioning to learn robust, nonlinear correction policies. In the large majority of reported settings, the approach outperforms both physics-blind operators and residual-minimizing hybrids in accuracy, generalization, and compute cost, with the largest relative advantage in ill-conditioned regimes where residual minimization is least reliable.

By separating the roles of error signal and optimization objective, ENS introduces a new paradigm for the integration of physics and learning in the solution of PDEs, with direct implications for generalizability and stability of neural surrogates in scientific computation (2606.27354).
