VINO: Variational Physics-Informed Neural Operator

Updated 23 December 2025
  • VINO is a framework that integrates classical variational principles with modern neural operators to solve families of PDEs without the need for labeled data.
  • It employs a hybrid neural-FEM discretization that enables analytic energy evaluation and robust mesh-refinement convergence analogous to high-quality finite element methods.
  • Empirical benchmarks demonstrate that VINO achieves lower relative errors and better generalization than data-driven (FNO) and strong-form physics-informed (PINO) baselines while remaining data efficient.

The Variational Physics-Informed Neural Operator (VINO) is a unified framework integrating energy principles from classical variational calculus with modern neural operator architectures to solve families of partial differential equations (PDEs), particularly those with an energy (variational) structure. VINO encompasses several algorithmic and architectural variants, but the core philosophy is to embed the weak or energy-based formulation of PDEs directly into the training loop of neural operator models, thereby enabling label-free learning, superior mesh-refinement behavior, and robust generalization across physical regimes. This paradigm directly targets fundamental bottlenecks of conventional data-driven neural operator training and of strong-form physics-informed approaches.

1. Variational Formulation and Theoretical Foundations

At the foundation of VINO are variational formulations of boundary-value PDEs. Let $\Omega\subset\mathbb{R}^d$ be a bounded domain, with $u:\Omega\to\mathbb{R}^m$ denoting the sought solution field. The strong-form PDE

$$L[u](x) = 0, \qquad x\in\Omega$$

with essential (Dirichlet) and natural (Neumann) boundary data is recast in terms of an energy functional

$$E[u] = \int_\Omega F(x, u, \nabla u, \ldots)\, dx + \int_{\partial\Omega_N} E_b(x, u, \ldots)\, d\Gamma$$

whose stationary points (vanishing first variations) recover the strong form and the natural boundary conditions via the Euler–Lagrange equations. Prototypically, for a self-adjoint operator with $A(x)$ positive-definite,

$$E[u] = \frac{1}{2} \int_\Omega (\nabla u)^T A(x)\,\nabla u\, dx - \int_\Omega f(x)\,u(x)\, dx - \int_{\partial\Omega_N} h(x)\,u(x)\, d\Gamma$$

implies $-\nabla\cdot(A\nabla u) = f$ in $\Omega$ and $(A\nabla u)\cdot n = h$ on $\partial\Omega_N$.

The minimizer of $E[u]$ subject to the essential boundary condition $u=g$ on $\partial\Omega_D$ is precisely the PDE solution.
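The connection between the energy functional and the strong form can be made explicit by a standard first-variation computation (sketched here for the quadratic functional above, with $v$ an admissible variation vanishing on $\partial\Omega_D$):

$$\delta E[u;v] = \int_\Omega (\nabla v)^T A\,\nabla u\, dx - \int_\Omega f\,v\, dx - \int_{\partial\Omega_N} h\,v\, d\Gamma = \int_\Omega \big(-\nabla\cdot(A\nabla u) - f\big)\,v\, dx + \int_{\partial\Omega_N} \big((A\nabla u)\cdot n - h\big)\,v\, d\Gamma$$

after integration by parts; requiring $\delta E[u;v]=0$ for all such $v$ recovers the PDE in $\Omega$ and the Neumann condition on $\partial\Omega_N$.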

2. Neural Operator Framework and Hybrid Discretization

VINO learns the nonlinear solution operator mapping input fields (coefficients, sources, boundary data) to PDE solutions, $\widehat{G}_\theta: p(\cdot)\mapsto u(\cdot)$. Architecturally, it adopts the neural operator formulation

$$\widehat{G}_\theta = Q \circ (W_L + K_L) \circ \cdots \circ \sigma(W_1 + K_1) \circ P,$$

where $P$ and $Q$ denote pointwise "lifting" and "projection" maps, $W_k$ are local linear operators, and $K_k$ are nonlocal (typically Fourier or kernel) operators.
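For orientation, the sketch below shows one such layer $\sigma(W_k + K_k)$ in PyTorch, with the nonlocal part realized as a truncated spectral (Fourier) convolution. The class names, mode-truncation scheme, and GELU activation are illustrative assumptions, not the reference VINO implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpectralConv2d(nn.Module):
    """Nonlocal operator K_k: a truncated spectral convolution (FNO-style)."""

    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (channels * channels)
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, modes, modes, dtype=torch.cfloat)
        )

    def forward(self, x):                         # x: (batch, channels, H, W)
        x_ft = torch.fft.rfft2(x)                 # transform to Fourier space
        out_ft = torch.zeros_like(x_ft)
        m = self.modes                            # keep only the lowest modes
        out_ft[:, :, :m, :m] = torch.einsum(
            "bixy,ioxy->boxy", x_ft[:, :, :m, :m], self.weights
        )
        return torch.fft.irfft2(out_ft, s=x.shape[-2:])


class OperatorLayer(nn.Module):
    """One layer sigma(W_k + K_k) acting on the lifted representation."""

    def __init__(self, channels, modes):
        super().__init__()
        self.local = nn.Conv2d(channels, channels, kernel_size=1)  # W_k (pointwise)
        self.spectral = SpectralConv2d(channels, modes)            # K_k (nonlocal)

    def forward(self, x):
        return F.gelu(self.local(x) + self.spectral(x))
```

A full operator stacks a lifting map $P$, several such layers, and a projection $Q$; in VINO the output is interpreted as nodal values on the discretization described next.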

The physical domain $\Omega$ is partitioned into finite elements $\{\Omega_e\}$, and within each element VINO parameterizes $u(x)|_{\Omega_e}$ as

$$u(x)|_{\Omega_e} \approx N(x)\,\widehat{u}_e,$$

where $N(x)\in\mathbb{R}^{1\times n_n}$ are shape functions (e.g., bilinear on rectangles) and $\widehat{u}_e\in\mathbb{R}^{n_n}$ are nodal values output by the neural operator. This hybrid "neural-FEM" representation enables analytic evaluation of local energies and residuals, thereby circumventing numerical quadrature and high-order automatic differentiation.
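As a concrete illustration of the elementwise representation $u(x)|_{\Omega_e} \approx N(x)\,\widehat{u}_e$, the snippet below evaluates standard bilinear shape functions on the reference square $[-1,1]^2$ and interpolates a set of illustrative nodal values; nothing here is specific to VINO beyond the choice of bilinear elements.

```python
import numpy as np


def bilinear_shape_functions(xi, eta):
    """Bilinear shape functions N(x) on the reference square [-1, 1]^2,
    with nodes ordered counterclockwise from the lower-left corner."""
    return 0.25 * np.array([
        (1 - xi) * (1 - eta),
        (1 + xi) * (1 - eta),
        (1 + xi) * (1 + eta),
        (1 - xi) * (1 + eta),
    ])


# u(x)|_{Omega_e} ~ N(x) u_hat_e: interpolate nodal values predicted by the operator
u_hat_e = np.array([0.0, 0.1, 0.4, 0.2])                  # illustrative nodal outputs
u_center = bilinear_shape_functions(0.0, 0.0) @ u_hat_e   # field value at element centre
```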

3. Variational Losses, Training Regimes, and Optimization

Rather than minimizing discrepancies with labeled solution data, VINO optimizes the discrete energy functional over batches of input samples:

$$\mathcal{L}(\theta) = \frac{1}{N}\sum_{i=1}^N \sum_e \left[ \frac{1}{2}\, \widehat{u}_e^{(i)\,T} K^e\, \widehat{u}_e^{(i)} - \widehat{u}_e^{(i)\,T} M^e f_e^{(i)} \right] + \mathcal{L}_{BC}(\theta)$$

with element-level stiffness ($K^e$) and mass ($M^e$) matrices defined as

$$K^e = \int_{\Omega_e} (\nabla N)^T (\nabla N)\, d\Omega_e, \qquad M^e = \int_{\Omega_e} N^T N\, d\Omega_e,$$

and $f_e^{(i)}$ being the elementwise projection of the source term. Dirichlet (essential) boundary conditions are imposed either via a penalty term in $\mathcal{L}_{BC}$ or by construction in the shape functions.
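A minimal sketch of this loss for bilinear quadrilaterals on a uniform grid is given below. The closed-form $K^e$ and $M^e$ are the textbook element matrices for a square element of side $h$ with $A = I$ and counterclockwise node ordering; gathering the per-element nodal values from the operator's grid output is assumed to happen elsewhere.

```python
import torch


def element_matrices(h):
    """Closed-form stiffness and mass matrices for a bilinear square element
    of side h (Laplace operator, A = I, counterclockwise node ordering)."""
    Ke = (1.0 / 6.0) * torch.tensor([
        [ 4.0, -1.0, -2.0, -1.0],
        [-1.0,  4.0, -1.0, -2.0],
        [-2.0, -1.0,  4.0, -1.0],
        [-1.0, -2.0, -1.0,  4.0],
    ])
    Me = (h * h / 36.0) * torch.tensor([
        [4.0, 2.0, 1.0, 2.0],
        [2.0, 4.0, 2.0, 1.0],
        [1.0, 2.0, 4.0, 2.0],
        [2.0, 1.0, 2.0, 4.0],
    ])
    return Ke, Me


def variational_loss(u_nodal, f_nodal, Ke, Me):
    """Discrete energy sum_e [ 1/2 u_e^T Ke u_e - u_e^T Me f_e ], averaged over
    the batch. u_nodal, f_nodal: (batch, n_elements, 4) element-gathered values."""
    internal = 0.5 * torch.einsum("bei,ij,bej->b", u_nodal, Ke, u_nodal)
    source = torch.einsum("bei,ij,bej->b", u_nodal, Me, f_nodal)
    return (internal - source).mean()
```

The boundary term $\mathcal{L}_{BC}$ (e.g., a penalty on Dirichlet nodes) is added on top, and gradients with respect to $\theta$ flow through `u_nodal`.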

Two central optimization strategies exist in this context (Xu et al., 2023):

  • Direct minimization: Backpropagation is carried out through the energy or weak-form residual, analogously to the Ritz or Galerkin finite element method, but with network outputs as degrees of freedom.
  • Iterative inner-solver update (e.g., conjugate gradient): The neural network output provides an initial guess, refined via a few inner iterations (steepest descent or CG), which then generate pseudo-labels to further guide learning.

No paired (solution, input) data is required, and all gradients flow solely through the nodal neural operator outputs.
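The second strategy can be sketched as follows: starting from the operator's prediction, a few matrix-free steepest-descent (or CG) steps on the discrete system produce a refined field that then serves as a pseudo-label. The helper below, including `apply_K`, the step count, and the exact line search, is an illustrative assumption rather than the precise procedure of Xu et al. (2023).

```python
import torch


def inner_refine(u0, apply_K, rhs, n_steps=5):
    """A few steepest-descent steps on the discrete system K u = rhs (K symmetric
    positive-definite), starting from the operator prediction u0. apply_K is a
    matrix-free stiffness action; the result is detached and used as a pseudo-label."""
    u = u0.detach().clone()
    for _ in range(n_steps):
        r = rhs - apply_K(u)                                 # residual = -gradient
        Kr = apply_K(r)
        alpha = (r * r).sum() / ((r * Kr).sum() + 1e-12)     # exact line search
        u = u + alpha * r
    return u
```

The network can then be trained against the returned pseudo-label with an ordinary mean-squared error, one plausible realization of the pseudo-label guidance described above.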

4. Computational Implementation and Scalability

By leveraging the analytic computation of elementwise quadratic forms, VINO avoids the computational overheads of numerical quadrature and spatial automatic differentiation. The stiffness and mass matrices depend only on element geometry and shape functions, and on uniform grids they are reused across elements. The per-element cost is $O(n_n^2)$, and the total computational cost per batch scales as $O(N_{\mathrm{elements}})$. This enables the use of finer meshes, improving resolution and solution accuracy with nearly linear cost scaling.
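On a uniform grid, this reuse can be exploited by gathering the four nodal values of every element in a single batched operation, after which all per-element quadratic forms are evaluated in one einsum. The helper below is an illustrative assumption about how such a gather might look, not code from the cited papers.

```python
import torch
import torch.nn.functional as F


def gather_element_nodal_values(u_grid):
    """Gather the 4 nodal values of every bilinear element from a uniform grid.

    u_grid: (batch, H, W) nodal field; returns (batch, (H-1)*(W-1), 4) so that
    per-element energies can be evaluated in one batched einsum."""
    patches = F.unfold(u_grid.unsqueeze(1), kernel_size=2)    # (batch, 4, n_elements)
    # Reorder unfold's row-major patch order into a cyclic node ordering consistent
    # with the element matrices; the exact permutation depends on the axis convention.
    patches = patches[:, [0, 1, 3, 2], :]
    return patches.transpose(1, 2)                            # (batch, n_elements, 4)
```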

Crucially, VINO demonstrates mesh-refinement convergence analogous to high-quality finite-element methods: as the mesh is refined (e.g., from $16\times 16$ up to $128\times 128$ for canonical problems), the error consistently decays, in contrast to the stagnation or deterioration of strong-form PINO approaches at high resolutions (Eshaghi et al., 10 Nov 2024).

5. Benchmarking and Empirical Performance

VINO has been empirically validated on standard ODE and PDE benchmarks (Eshaghi et al., 10 Nov 2024, Xu et al., 2023), including:

  • Second-order ODE (anti-derivative): VINO achieves a relative $L^2$ error of $0.36\%\pm0.36\%$, outperforming FNO ($0.54\%\pm0.61\%$) and PINO ($1.83\%\pm0.97\%$).
  • 2D Poisson equation: VINO achieves $0.75\%\pm0.33\%$ versus FNO at $0.94\%\pm0.30\%$ and PINO at $2.63\%\pm1.92\%$.
  • Darcy flow: VINO achieves $0.93\%\pm0.31\%$, with further reductions when low-resolution solution data are included.

These experiments demonstrate that VINO matches the training speed of established neural operator methods (100–250 s per training regime) while offering marked reductions in error. Notably, VINO exhibits robust error decay under mesh refinement and is resilient to input distribution shifts (e.g., it generalizes across sinusoidal, exponential, and random forcing fields) (Eshaghi et al., 10 Nov 2024).

Scaling laws indicate that, with purely label-free training and a minimal set of labeled samples for distribution-shift correction, test errors decrease as a power law of the unlabeled sample count (e.g., $E\approx 1.13\,N^{-0.22}$ for Darcy flow) (Xu et al., 2023).
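As a purely arithmetic illustration of what such a fit implies, doubling the unlabeled sample count multiplies the predicted test error by $2^{-0.22}\approx 0.86$ (roughly a 14% reduction per doubling), while increasing $N$ tenfold reduces it by a factor of $10^{-0.22}\approx 0.60$.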

6. Extensions, Limitations, and Comparative Landscape

VINO's principal strengths:

  • Label efficiency: effective learning without ground-truth solution data, making VINO far more data-efficient than purely data-driven neural operators.
  • Physics integration: Variational principle enforces global physical consistency beyond pointwise PDE residuals.
  • Analytic loss evaluation: No high-order derivatives or quadrature; fully compatible with GPU-based batch processing.
  • Mesh-independence and convergence robustness: accuracy improves systematically under mesh refinement.

Limitations and open avenues:

  • Applicability is currently limited to PDEs admitting a variational (energy) formulation—nonvariational and certain first-order/hyperbolic equations remain outside current scope (Eshaghi et al., 10 Nov 2024).
  • Domain discretization and shape-function assembly can be nontrivial, particularly for nonrectangular or unstructured meshes.
  • Adaptive meshing, multi-scale coupling, and rigorous error estimation frameworks are active areas for future development.

Comparative designs (e.g., data-driven FNO, strong-form PINO) require extensive labeled data or suffer from mesh-refinement failures. Variational operator learning (Xu et al., 2023) and VINO provide a framework for Ritz/Galerkin matrix-free operator learning via convolutional evaluation—offering a natural marriage between classical finite element strategies and neural network expressivity.

Bayesian and uncertainty-quantified variants of VINO have also been constructed using normalizing flows over neural functional priors (e.g., via GANs and DeepONet surrogates), demonstrating that stochastic-gradient variational inference (VI) can yield uncertainty quantification with accuracy comparable to full-batch Hamiltonian Monte Carlo, but with scalability and computational efficiency for minibatch contexts (Meng, 2023).

7. Outlook and Prospective Directions

Future research centers on expanding VINO’s applicability to broader PDE families, generalizing to weak-form (e.g., Petrov–Galerkin) formulations, accommodating unstructured and higher-order spectral elements, and integrating adaptive mesh refinement guided by physics-informed error indicators. Multi-physics and multi-scale applications are plausible as energy principles are ubiquitous across scientific and engineering domains. Pre-training libraries of neural operators for rapid, domain-specific simulation promises major reductions in computational simulation costs (Eshaghi et al., 10 Nov 2024).

VINO establishes a principled, mesh-invariant, data-efficient, and physically faithful operator-learning paradigm, merging the mathematical structure of variational calculus with the flexibility of deep neural operators to advance simulation and modeling in computational science.
