Auxiliary Physics-Informed Neural Networks
- A-PINN is an advanced form of PINN that incorporates auxiliary variables or tasks to stabilize learning for high-order PDEs.
- It improves numerical conditioning by reducing the differential order and effectively mitigating gradient pathologies in complex multi-physics simulations.
- A-PINNs extend neural architectures to output both primary and auxiliary variables, resulting in enhanced computational efficiency and accuracy.
Auxiliary Physics-Informed Neural Networks (A-PINNs) are variants of Physics-Informed Neural Networks (PINNs) that systematically incorporate auxiliary variables or auxiliary tasks into the neural architecture and loss formulation. This improves the solution quality of partial differential equations (PDEs), especially for problems involving high-order derivatives, integral constraints, parameterized boundary conditions, coupled multi-physics, or generalized operator parameterization. A-PINNs have been deployed in diverse scientific-computing contexts, including structural vibration, radiative transfer, neutron star magnetospheres, and multi-task PDE solving, and provide significant gains in stability, generalization, accuracy, and computational efficiency over standard PINNs (Urbán et al., 2023, Saini et al., 30 Dec 2025, Riganti et al., 2023, Yan et al., 2023).
1. Motivations and Core Principles
Standard PINNs encode PDEs into the learning objective of a neural network by penalizing deviations from the governing equations, boundary and initial conditions, and available data. However, direct enforcement of high-order derivatives via automatic differentiation (AD) in standard PINNs often results in ill-conditioned optimization, slow convergence, and gradient pathologies, particularly for stiff or high-order systems, or for PDEs with integral operators or parameterized conditions. The A-PINN paradigm addresses these deficiencies by introducing explicit auxiliary variables—either at the level of the physical solution (e.g., derivatives, integrals) or through auxiliary tasks on related problem instances—to restructure the learning problem and improve both numerical conditioning and representation power (Saini et al., 30 Dec 2025, Riganti et al., 2023, Urbán et al., 2023, Yan et al., 2023).
2. Auxiliary Variable Formulations in Differential and Integro-Differential Problems
A-PINNs systematically "lift" selected high-order derivative or integral terms of a PDE into dedicated auxiliary variables, which are learned as additional outputs or network components. For high-order PDEs (e.g., the fourth-order Euler-Bernoulli beam equation), introducing auxiliary outputs for lower-order derivatives reduces the order of differentiation that AD must perform in each loss term. For integro-differential equations, such as the radiative transfer equation (RTE) with complex integral scattering terms, each contributing integral is formulated as an auxiliary variable (often through a projection, e.g., a Legendre expansion), transforming the original integro-differential equation into an augmented system of coupled differential and auxiliary constraints (Saini et al., 30 Dec 2025, Riganti et al., 2023).
This auxiliary approach yields better-conditioned loss surfaces, decomposes complex operators, and alleviates the vanishing/exploding gradient issues prevalent in deep or stiff PINN setups, while retaining the physics-informed structure of the loss.
| Application Domain | Auxiliary Variable Type | Main Benefit |
|---|---|---|
| High-order ODE/PDE (beams) | Lower-order derivatives of the solution | Reduces differential order |
| Radiative transfer | Projected integrals (e.g., Legendre moments) | Removes quadrature error |
| Magnetosphere modeling | Parameter coefficients as inputs (e.g., multipole coefficients) | Generalizes operator |
This table summarizes representative A-PINN auxiliary variable choices from the literature (Saini et al., 30 Dec 2025, Riganti et al., 2023, Urbán et al., 2023).
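To make the derivative-lifting case concrete, the following minimal PyTorch sketch solves a toy fourth-order problem $u''''(x) = f(x)$ by giving the network a second output $m$ intended to equal $u''$, so that every residual needs only second-order automatic differentiation. The specific equation, forcing term, and network sizes here are illustrative assumptions, not taken from the cited papers.

```python
import torch

# Lift u'' into an auxiliary output m so that both the lifted PDE
# residual (m'' = f) and the auxiliary constraint (m = u'') require
# only second-order automatic differentiation.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 2),            # outputs: [u, m], with m ~ u''
)

def grad(y, x):
    # derivative of y w.r.t. x, keeping the graph for higher-order AD
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]

def residuals(x, f):
    out = net(x)
    u, m = out[:, :1], out[:, 1:]
    u_xx = grad(grad(u, x), x)         # second-order AD only
    m_xx = grad(grad(m, x), x)
    aux_res = m - u_xx                 # auxiliary constraint: m = u''
    pde_res = m_xx - f(x)              # lifted PDE: m'' = f
    return pde_res, aux_res

x = torch.rand(128, 1, requires_grad=True)   # collocation points
pde_res, aux_res = residuals(x, torch.sin)   # toy forcing f(x) = sin(x)
loss = (pde_res ** 2).mean() + (aux_res ** 2).mean()
loss.backward()                        # gradients for an optimizer step
```

The same pattern extends to integral terms: each projected integral (e.g., a Legendre moment) becomes an additional network output constrained by its defining relation.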
3. Neural Network Architectures and Training Strategies
A-PINN architectures typically extend single-output PINNs to multi-output networks, where network outputs encode both the primary solution variables and the auxiliary variables (derivatives, integrals, or task-specific outputs). Training proceeds by constructing a composite loss function, often a weighted sum of multiple terms: primary PDE residuals, auxiliary variable constraints, boundary and initial conditions, and optionally data observation terms.
Architectural specifics are adapted to the problem structure. For example:
- For the Euler-Bernoulli beam, the network takes the spatial and temporal coordinates as input and outputs both the deflection field and its auxiliary lower-order derivatives (Saini et al., 30 Dec 2025).
- For radiative transfer in slab geometry, the inputs are the spatial and angular coordinates, and the outputs are the radiance and multiple auxiliary moment variables, each corresponding to a term in the Legendre phase-function expansion (Riganti et al., 2023).
- For parameterized PDEs with variable boundary or source conditions, the input vector is augmented with auxiliary coefficients (e.g., multipole coefficients) representing the physical parameterization (Urbán et al., 2023).
Loss terms corresponding to auxiliary outputs are imposed at dedicated collocation points, ensuring close coupling of the auxiliary constraints to the solution dynamics. Optimizer selection varies; Adam with learning-rate decay and/or L-BFGS refinements are common (Saini et al., 30 Dec 2025, Urbán et al., 2023). Weights on loss components may be manually set or adaptively tuned during training, such as “balanced adaptive optimization” to enforce dynamic range parity (Saini et al., 30 Dec 2025).
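The following training-loop sketch illustrates this composite-loss setup on a toy lifted heat equation: the auxiliary output $v$ is constrained to equal $u_x$, so the lifted PDE $u_t = v_x$ is first-order in each field. The weights, network sizes, and learning-rate schedule are illustrative placeholders, not values from the cited papers.

```python
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 2),                  # outputs: [u, v]
)

def grad(y, x):
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                               create_graph=True)[0]

def losses(x_col, x_bc):
    out = net(x_col)
    u, v = out[:, :1], out[:, 1:]
    du = grad(u, x_col)                      # columns: [u_x, u_t]
    u_x, u_t = du[:, :1], du[:, 1:]
    aux = ((v - u_x) ** 2).mean()            # auxiliary constraint v = u_x
    v_x = grad(v, x_col)[:, :1]
    pde = ((u_t - v_x) ** 2).mean()          # lifted PDE u_t = v_x
    bc = (net(x_bc)[:, :1] ** 2).mean()      # toy Dirichlet condition u = 0
    return pde, aux, bc

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.999)
w_pde, w_aux, w_bc = 1.0, 1.0, 10.0          # manually set loss weights

for step in range(2000):
    x_col = torch.rand(256, 2, requires_grad=True)        # (x, t) points
    x_bc = torch.cat([torch.zeros(32, 1), torch.rand(32, 1)], dim=1)
    L_pde, L_aux, L_bc = losses(x_col, x_bc)
    loss = w_pde * L_pde + w_aux * L_aux + w_bc * L_bc
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()
```

Adaptive weighting schemes would replace the fixed `w_*` values with quantities updated during training, e.g., to equalize the dynamic ranges of the loss components.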
4. Auxiliary-Task Learning (ATL) and Operator Generalization
A distinct axis within the A-PINN family is the use of auxiliary-task learning (ATL) [Editor’s term; cf. (Yan et al., 2023)], where PINNs are trained on the primary PDE instance together with one or more “auxiliary” tasks that share the primary operator but differ in initial or boundary conditions. Auxiliary tasks are constructed within the same PDE family and combined via shared (and potentially private) neural representations.
ATL-A-PINN architectures include:
- Hard parameter sharing (single shared trunk, task-specific towers),
- Soft sharing (partitioned experts for common and private representations),
- Mixture-of-experts (MMoE, with gate networks),
- Progressive layered extraction (PLE, hierarchical public/private mixtures).
A key innovation is a gradient cosine-similarity algorithm that ensures updates from auxiliary losses help the main task: only auxiliary gradients with positive cosine similarity to the main-task gradient are admitted (Yan et al., 2023). Quantitative results demonstrate substantial gains: up to 96.6% reduction in error over single-task PINNs in some PDE benchmarks, with an average improvement of 28.2% (Yan et al., 2023).
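The gating rule can be sketched as follows. This is a minimal PyTorch illustration of the published idea; the gradient-flattening scheme and the simple additive combination are our assumptions for exposition, not the authors' exact algorithm.

```python
import torch

# An auxiliary loss contributes to the parameter update only when its
# gradient has positive cosine similarity with the main-task gradient.

def flat_grad(loss, params):
    grads = torch.autograd.grad(loss, params, retain_graph=True,
                                allow_unused=True)
    return torch.cat([
        (g if g is not None else torch.zeros_like(p)).reshape(-1)
        for g, p in zip(grads, params)
    ])

def gated_step(model, opt, main_loss, aux_losses):
    params = [p for p in model.parameters() if p.requires_grad]
    g_main = flat_grad(main_loss, params)
    g_total = g_main.clone()
    for aux_loss in aux_losses:
        g_aux = flat_grad(aux_loss, params)
        cos = torch.nn.functional.cosine_similarity(g_main, g_aux, dim=0)
        if cos > 0:                          # admit only helpful directions
            g_total = g_total + g_aux
    opt.zero_grad()
    offset = 0                               # write combined gradient back
    for p in params:
        n = p.numel()
        p.grad = g_total[offset:offset + n].view_as(p).clone()
        offset += n
    opt.step()
```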
5. Quantitative Performance and Numerical Results
A-PINN approaches consistently demonstrate enhanced numerical performance over standard PINNs and classical methods across varied domains:
- For structural vibration (Euler-Bernoulli beam), A-PINN achieves a marked reduction in mean squared error (MSE) and converges in roughly 30% fewer epochs than PINN baselines; maximum improvements of roughly 75% versus PINN, with further gains over finite-difference/symplectic-ANN references, are reported (Saini et al., 30 Dec 2025).
- For radiative transfer, A-PINN attains low average radiative-moment errors against analytic and benchmark tables for both the Milne problem and Henyey-Greenstein phase functions (on the order of 0.1% for the latter), with GPU training times of $3$–$65$ minutes depending on phase-function complexity (Riganti et al., 2023).
- In neutron star magnetosphere modeling, A-PINN achieves small relative errors in the poloidal flux and magnetic-field components, with substantial speed-ups per new boundary/source parameter tuple after training compared to classical solvers (Urbán et al., 2023).
- For auxiliary-task learning on PDEs (diffusion-reaction, Burgers', shallow water), ATL-PINNs show consistent error reductions, smoother loss convergence, and reduced late-time deviation from analytic solutions (Yan et al., 2023).
6. Generalization, Limitations, and Extensions
A-PINNs systematically enhance generalization capabilities. In operator-parameterized form, e.g., with input-augmented auxiliary coefficients such as multipole coefficients, A-PINNs can infer solutions for arbitrary combinations of boundary/source parameters within the trained ranges without retraining, and remain robust under moderate extrapolation (Urbán et al., 2023). Similarly, in ATL, training on families of auxiliary problems promotes robust feature representations.
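A minimal sketch of this input augmentation follows; the dimensions, sampling ranges, and parameter tuple are illustrative assumptions, not values from the cited work.

```python
import torch

# Operator parameterization via input augmentation: the network receives
# coordinates plus auxiliary parameter coefficients, so one trained model
# covers a family of problem instances.
n_coords, n_params = 2, 3
net = torch.nn.Sequential(
    torch.nn.Linear(n_coords + n_params, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

# during training, parameters are sampled alongside collocation points
x = torch.rand(256, n_coords, requires_grad=True)
theta = torch.rand(256, n_params) * 2 - 1        # parameter draws in [-1, 1]
u = net(torch.cat([x, theta], dim=1))            # feed (coords, params)

# after training, a new instance is solved by fixing the parameter tuple
theta_new = torch.tensor([[0.3, -0.7, 0.1]]).expand(256, n_params)
u_new = net(torch.cat([x, theta_new], dim=1))    # no retraining needed
```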
Limitations and considerations:
- Higher computational cost per training epoch due to increased outputs and auxiliary loss terms, though forward evaluation post-training remains highly efficient (Saini et al., 30 Dec 2025, Urbán et al., 2023).
- Hyperparameter sensitivity, particularly for network size, sampling strategy, and loss weighting.
- Present studies largely focus on canonical PDE families (linear, undamped cases, simply supported BCs, 1D or slab geometry); extending to nonlinear, multi-physics, or multidimensional cases is conceptually direct but computationally more involved (Saini et al., 30 Dec 2025, Riganti et al., 2023).
- Selection of auxiliary variables and tasks is problem-specific, and there is no universal prescription yet for optimal auxiliary constructions (Yan et al., 2023).
Open directions include application to damped/nonlinear PDEs, operator learning (via DeepONets or Fourier neural operators), uncertainty quantification (Bayesian/ensemble A-PINNs), domain decomposition, and inverse/modular multi-physics scenarios (Saini et al., 30 Dec 2025, Riganti et al., 2023, Yan et al., 2023).
7. Applications and Impact
A-PINNs have enabled efficient and accurate solution of complex PDEs and coupled integro-differential systems in fields where traditional numerical solvers are insufficient or computationally prohibitive, notably:
- Magnetic field evolution in neutron star interiors coupled to force-free magnetospheres, where global elliptic solves are otherwise a bottleneck for long-term simulations (Urbán et al., 2023).
- Structural vibration analysis in engineering, with robust benchmarks for undamped/forced scenarios and significant improvements in predictive performance and stability (Saini et al., 30 Dec 2025).
- Radiative transfer and coupled radiation-conduction in participating media, relevant for photonics, biomedical imaging, and thermal management, capturing phenomena beyond classical conduction models without quadrature error (Riganti et al., 2023).
- Multi-task PDE learning scenarios where auxiliary tasks inform and regularize main-task solution quality across diverse physical models (Yan et al., 2023).
A-PINN frameworks thus provide a generalizable, modular methodology for advancing physics-informed machine learning in computational science and engineering.