- The paper introduces the Schrödinger Neural Network, a novel framework that applies quantum mechanics principles to achieve exact normalization in conditional density estimation.
- The approach utilizes complex-valued spectral expansions to naturally model multimodal distributions and compute downstream statistics analytically.
- The method demonstrates superior performance in recovering multimodal posteriors and offers principled capacity control via operator calculus and targeted regularization.
Schrödinger Neural Network and Uncertainty Quantification: Quantum Machine
Introduction and Motivation
The paper introduces the Schrödinger Neural Network (SNN), a conditional density estimation framework that leverages quantum mechanical principles—specifically, the representation of uncertainty via normalized wave functions and the Born rule for probability assignment. The SNN is designed to overcome limitations of conventional approaches such as mixture density networks (MDNs), normalizing flows (NFs), and energy-based models (EBMs), which often struggle with multimodality, normalization, and analytic tractability. By parameterizing the conditional law p(y∣x) as the squared modulus of a learned complex amplitude function ψx(y), the SNN achieves exact normalization, native multimodality, and analytic computation of downstream statistics.
The SNN maps each input $x$ to a complex-valued amplitude $\psi_x(y)$, represented as a finite spectral expansion in an orthonormal basis (e.g., Chebyshev polynomials):

$$\psi_x(y) = \sum_{k=0}^{K} c_k(x)\,\phi_k(y)$$

where $c_k(x) \in \mathbb{C}$ are complex coefficients predicted by a neural network, and $\phi_k(y)$ are basis functions. The conditional density is given by:

$$p(y \mid x) = \frac{\left|\psi_x(y)\right|^2}{\lVert c(x) \rVert_2^2}$$

Normalization is enforced analytically via the $\ell_2$ norm of the coefficient vector, eliminating the need for numerical quadrature or partition function estimation. The network architecture typically consists of a multi-layer perceptron (MLP) with hidden layers (e.g., GELU activations), outputting both real and imaginary parts of the coefficients. For a basis order $K$, the output layer has $2(K+1)$ units.
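As a concrete illustration, the minimal sketch below evaluates this Born-rule density for a fixed coefficient vector. It substitutes normalized Legendre polynomials (orthonormal on $[-1, 1]$ under the Lebesgue measure) for the Chebyshev example above, and uses placeholder coefficients in place of a network's output; the quadrature at the end only verifies the normalization that the $\ell_2$ norm already guarantees.

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_basis(y, K):
    """Evaluate phi_0..phi_K at points y: normalized Legendre polynomials,
    orthonormal on [-1, 1] under the Lebesgue measure."""
    V = legendre.legvander(y, K)                      # (len(y), K+1), columns P_k(y)
    norms = np.sqrt((2 * np.arange(K + 1) + 1) / 2)   # normalization constants for P_k
    return V * norms                                  # phi_k = sqrt((2k+1)/2) * P_k

def born_density(y, c):
    """p(y|x) = |psi_x(y)|^2 / ||c(x)||_2^2 with psi_x = sum_k c_k phi_k."""
    psi = orthonormal_basis(y, len(c) - 1) @ c        # complex amplitude at each y
    return np.abs(psi) ** 2 / np.sum(np.abs(c) ** 2)

# Placeholder complex coefficients; in the SNN these come from the MLP given x.
rng = np.random.default_rng(0)
c = rng.normal(size=6) + 1j * rng.normal(size=6)
y = np.linspace(-1, 1, 2001)
p = born_density(y, c)
print(np.trapz(p, y))   # ~1.0: normalization is analytic, quadrature only checks it
```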
Training Objective
The SNN is trained by maximizing the exact conditional log-likelihood under the Born rule:
$$\mathcal{L}_{\mathrm{NLL}}(\theta) = -\sum_{i=1}^{N} \log \left|\psi_{x_i}(y_i)\right|^2$$
with analytic normalization. Regularization is incorporated via quadratic penalties on the coefficients, including kinetic energy (spectral smoothness) and potential energy (tail mass control):
- Kinetic Regularizer: $E_{\mathrm{kin}}(x) = c^{\mathsf{H}}(x)\,\mathbf{K}\,c(x)$, penalizing high-frequency content.
- Potential Regularizer: $E_{\mathrm{pot}}(x) = c^{\mathsf{H}}(x)\,\mathbf{M}\,c(x)$, shaping localization.
The total objective is:
$$J(\theta) = \mathcal{L}_{\mathrm{NLL}}(\theta) + \lambda_{\mathrm{kin}} E_{\mathrm{kin}} + \lambda_{\mathrm{pot}} E_{\mathrm{pot}}$$
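A minimal PyTorch sketch of this objective, under stated assumptions: `coeff_net` is any MLP emitting $2(K+1)$ real outputs, `Phi` holds precomputed basis values $\phi_k(y_i)$ at each target, and `K_mat` / `M_mat` are placeholder Hermitian penalty matrices standing in for the kinetic and potential forms.

```python
import torch

def snn_loss(coeff_net, Phi, x, K_mat, M_mat, lam_kin=1e-5, lam_pot=1e-5):
    """Sketch of J(theta) = L_NLL + lam_kin * E_kin + lam_pot * E_pot for one batch.

    coeff_net    : module mapping x of shape (B, d) to (B, 2*(K+1)) real outputs.
    Phi          : (B, K+1) real tensor of basis values phi_k(y_i) at each target y_i.
    K_mat, M_mat : (K+1, K+1) Hermitian penalty matrices (kinetic / potential stand-ins).
    """
    re, im = coeff_net(x).chunk(2, dim=-1)
    c = torch.complex(re, im)                               # (B, K+1) complex coefficients

    psi = (Phi.to(c.dtype) * c).sum(dim=-1)                 # psi_{x_i}(y_i), shape (B,)
    norm_sq = (c.conj() * c).real.sum(dim=-1)               # ||c(x_i)||_2^2, analytic normalizer
    nll = -(torch.log(psi.abs() ** 2 + 1e-12) - torch.log(norm_sq)).mean()

    # Quadratic-form regularizers c^H A c; real-valued since A is Hermitian.
    e_kin = torch.einsum('bj,jk,bk->b', c.conj(), K_mat.to(c.dtype), c).real.mean()
    e_pot = torch.einsum('bj,jk,bk->b', c.conj(), M_mat.to(c.dtype), c).real.mean()
    return nll + lam_kin * e_kin + lam_pot * e_pot
```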
Expressivity, Multimodality, and Complex Coefficients
The SNN's spectral representation enables native multimodality and asymmetry through interference among basis modes. Complex coefficients provide additional phase degrees of freedom, allowing fine control over the shape of $p(y \mid x)$ without increasing basis size. This mechanism is more efficient than explicit mixture modeling, as multimodal and skewed densities emerge from amplitude interference rather than component enumeration.
Theoretical analysis shows that the SNN is a universal approximator for conditional densities under mild regularity, with exponential convergence for smooth targets. The choice of basis order K and regularization strength directly controls the trade-off between sharpness and smoothness, as dictated by uncertainty relations analogous to those in quantum mechanics.
Operator Calculus and Uncertainty Quantification
A key innovation is the operator-based extension, where observables, constraints, and weak labels are encoded as self-adjoint operators acting on the amplitude space. For any measurable function $o(y)$, the expectation under the SNN is:

$$\mathbb{E}[o(y) \mid x] = \langle \psi_x, \hat{o}\,\psi_x \rangle$$

where $\hat{o}$ is the multiplication operator. This formalism enables analytic computation of moments, quantiles, credible intervals, and risk functionals as quadratic forms in coefficient space. Constraints and weak supervision are incorporated as sparse matrix operations, facilitating efficient training and calibration.
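To make the quadratic-form reading concrete, the sketch below assembles the matrix of $\hat{o}$ in the truncated basis once (by Gauss-Legendre quadrature, offline and independent of $x$) and then evaluates expectations as $c^{\mathsf{H}} \mathbf{O}\, c / \lVert c \rVert_2^2$. The normalized-Legendre basis and coefficient values are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre

def operator_matrix(o, K, n_quad=200):
    """O_jk = integral over [-1, 1] of phi_j(y) o(y) phi_k(y) dy,
    built once by Gauss-Legendre quadrature for the normalized Legendre basis."""
    nodes, weights = legendre.leggauss(n_quad)
    Phi = legendre.legvander(nodes, K) * np.sqrt((2 * np.arange(K + 1) + 1) / 2)
    return Phi.T @ (weights[:, None] * o(nodes)[:, None] * Phi)

def expectation(o, c):
    """E[o(y)|x] = <psi_x, o_hat psi_x> = c^H O c / ||c||_2^2."""
    O = operator_matrix(o, len(c) - 1)
    return np.real(np.conj(c) @ O @ c) / np.sum(np.abs(c) ** 2)

# Example: conditional mean and variance from the same coefficient vector.
c = np.array([1.0, 0.5j, 0.2, 0.1j, 0.0, 0.05])
mean = expectation(lambda y: y, c)
second = expectation(lambda y: y ** 2, c)
print(mean, second - mean ** 2)   # moments as quadratic forms in coefficient space
```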
Implementation Details and Empirical Evaluation
Network Implementation
- Input: $x \in \mathbb{R}^d$
- MLP: 3 hidden layers, 256 units each, GELU activation
- Output: $2(K+1)$ units (real and imaginary parts of ck(x))
- Normalization: Analytic projection onto the unit sphere in coefficient space
- Optimizer: Adam, learning rate $10^{-3}$, regularization coefficient $10^{-5}$, early stopping
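A compact PyTorch sketch of the configuration listed above; the input dimension and basis order shown are illustrative placeholders rather than values reported in the paper.

```python
import torch
import torch.nn as nn

class AmplitudeNet(nn.Module):
    """MLP mapping x to the complex coefficient vector c(x), per the list above."""
    def __init__(self, d_in, K, width=256, depth=3):
        super().__init__()
        layers, d = [], d_in
        for _ in range(depth):
            layers += [nn.Linear(d, width), nn.GELU()]
            d = width
        layers.append(nn.Linear(d, 2 * (K + 1)))             # Re and Im parts of c_0..c_K
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        re, im = self.net(x).chunk(2, dim=-1)
        c = torch.complex(re, im)
        # Analytic projection onto the unit sphere in coefficient space.
        return c / torch.linalg.vector_norm(c, dim=-1, keepdim=True)

# Illustrative sizes: d_in and K are placeholders.
model = AmplitudeNet(d_in=4, K=32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```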
Example: Inverse Problem
The SNN is evaluated on canonical inverse problems with non-invertible forward maps, where the conditional law $p(t \mid x)$ is sharply multimodal. Empirical results demonstrate:
- Stable optimization and generalization, with validation NLL minima indicating proper regularization
- Accurate recovery of multimodal posterior geometry, with high-density ridges matching true branches
- Superior mass allocation and modal separation compared to real-only SNNs and MDNs
- Quantitative diagnostics: mode count error, location error, allocation error, entropy profiles, and JS divergence
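Of the diagnostics in the last item, the JS-divergence term is the simplest to reproduce. The sketch below is a minimal grid-based version, assuming both densities are available on a common grid; this is an illustration of the metric, not necessarily how the paper computes it.

```python
import numpy as np

def js_divergence(p, q, y):
    """Jensen-Shannon divergence between two densities tabulated on a common grid y."""
    p = p / np.trapz(p, y)                    # guard against grid-discretization error
    q = q / np.trapz(q, y)
    m = 0.5 * (p + q)

    def kl(a, b):
        # KL(a || b) with the 0 * log 0 = 0 convention.
        ratio = np.ones_like(a)
        np.divide(a, b, out=ratio, where=a > 0)   # b > 0 wherever a > 0, since b is the mixture
        return np.trapz(a * np.log(ratio), y)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```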
Multivariate Extension
For $y \in \mathbb{R}^m$, the SNN employs tensor-product bases and low-rank/separable expansions to mitigate parameter growth. Normalization and operator calculus generalize via contractions of Gram matrices, preserving analytic tractability.
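As one way to read the Gram-matrix contraction, the sketch below computes the squared norm of a rank-$R$ separable two-dimensional amplitude analytically from its factor coefficients; the rank, basis size, and random coefficients are illustrative assumptions.

```python
import numpy as np

def separable_norm_sq(A, B):
    """||psi||^2 for a rank-R separable 2-D amplitude
    psi(y1, y2) = sum_r (sum_j A[r, j] phi_j(y1)) * (sum_k B[r, k] phi_k(y2)),
    with an orthonormal basis: ||psi||^2 = sum_{r, s} <a_r, a_s> <b_r, b_s>."""
    Ga = A.conj() @ A.T        # (R, R) Gram matrix of the y1 factors
    Gb = B.conj() @ B.T        # (R, R) Gram matrix of the y2 factors
    return np.real(np.sum(Ga * Gb))

# Rank-3 expansion with K+1 = 8 modes per output dimension (illustrative sizes).
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 8)) + 1j * rng.normal(size=(3, 8))
B = rng.normal(size=(3, 8)) + 1j * rng.normal(size=(3, 8))
print(separable_norm_sq(A, B))   # divide |psi|^2 by this to obtain a normalized density
```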
Trade-offs and Model Selection
The SNN exposes explicit "dials" for capacity control: basis order, regularization strengths, and phase allowance. Increasing K improves resolution but risks overfitting; strong kinetic regularization enforces smoothness but may blur genuine modes. The operator calculus enables principled multi-objective training and calibration, with diagnostics rooted in amplitude geometry.
Limitations and Future Directions
- Spectral Truncation: Requires careful selection of basis order and domain mapping; poor choices induce boundary artifacts or oscillations.
- High-dimensional Outputs: While separable and low-rank constructions help, very large m may require tensor-network or convolutional spectral layers.
- Interpretability: Complex coefficients shift interpretability to operator actions and amplitude geometry; visualization tools for phase and interference are needed.
Future research directions include adaptive basis selection, high-dimensional structure via tensor decompositions, hybrid models combining SNNs with flows or score models, Riemannian optimization on the unit sphere, generalized supervision via operator-valued measures, and standardized evaluation protocols for multimodal diagnostics.
Conclusion
The Schrödinger Neural Network provides a coherent, physically inspired framework for conditional density estimation and uncertainty quantification. By representing uncertainty as normalized wave amplitudes and leveraging the Born rule, the SNN achieves exact normalization, native multimodality, and analytic computation of downstream statistics. Its operator calculus unifies modeling, supervision, and decision-making, while spectral parameterization enables principled capacity control and diagnostics. The approach synthesizes strengths of MDNs, NFs, and EBMs, while avoiding their respective pitfalls. Limitations in spectral truncation and high-dimensional scaling motivate ongoing research in adaptive representations and scalable architectures. The SNN framework is well-positioned for deployment in risk-sensitive, scientific, and decision-theoretic applications requiring rigorous uncertainty quantification.