Surjective Pseudo-Invertible Neural Networks

Updated 20 May 2026

SPNNs are neural architectures that ensure surjectivity: every output has at least one corresponding input, selected by a well-defined pseudo-inverse.
They generalize the Moore–Penrose inverse to non-linear settings, facilitating tractable inversion for inference, generative modeling, and safety applications.
SPNNs integrate modular surjective blocks and non-linear back-projection methods to provide mathematically grounded solutions for complex inverse problems.

Surjective Pseudo-Invertible Neural Networks (SPNNs) are a broad class of neural architectures designed to explicitly ensure surjectivity—guaranteeing that every possible output has at least one pre-image—and to provide a canonical pseudo-inverse mapping that solves the inverse problem even in non-injective and non-linear regimes. SPNNs generalize both the Moore–Penrose pseudo-inverse for linear maps and the construction of invertible flows, extending the algebraic and geometric principles of pseudo-invertibility into deep learning. This approach enables tractable and principled inversion of arbitrary non-linear neural networks, with direct consequences for inference, generative modeling, semantic inversion, and even issues of safety and adversarial control.

1. Mathematical Foundations: Surjectivity and Pseudo-Invertibility

A surjective function $f:X\to Y$ ensures that for every $y\in Y$, there exists at least one $x\in X$ with $f(x)=y$. Pseudo-invertibility extends this concept: a pseudo-inverse $f^+:Y\to X$ satisfies $f(f^+(y))=y$ for all $y\in Y$. In the linear case, the Moore–Penrose pseudo-inverse $A^{\dagger}$ provides the unique minimum-norm pre-image. For non-linear and high-dimensional neural settings, only the first two Penrose identities (reflexivity) can generally be satisfied:

$f(f^+(f(x))) = f(x)$,
$f^+(f(f^+(y))) = f^+(y)$.

SPNNs are constructed to enforce these identities structurally, enabling consistent and well-defined inference for any output in the target space (Ehrlich et al., 5 Feb 2026, Jiang et al., 26 Aug 2025).
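For the linear case, these properties can be checked numerically. The sketch below uses NumPy's `np.linalg.pinv` on a wide, full-row-rank matrix (a surjective linear map). The matrix `A`, the target `y`, and the null-space vector `null_vec` are arbitrary illustrative values, not taken from the cited works.

```python
import numpy as np

# A wide matrix with full row rank is a surjective linear map R^3 -> R^2.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
A_pinv = np.linalg.pinv(A)              # Moore-Penrose pseudo-inverse, shape (3, 2)

y = np.array([2.0, -1.0])
x_min = A_pinv @ y                      # minimum-norm pre-image of y

# Right-inverse property: f(f^+(y)) = y.
assert np.allclose(A @ x_min, y)

# The two reflexivity identities: A A^+ A = A and A^+ A A^+ = A^+.
assert np.allclose(A @ A_pinv @ A, A)
assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)

# Any other pre-image differs by a null-space vector and has a larger norm.
null_vec = np.array([1.0, 1.0, -1.0])   # satisfies A @ null_vec == 0
x_other = x_min + 0.5 * null_vec
assert np.allclose(A @ x_other, y)
assert np.linalg.norm(x_min) < np.linalg.norm(x_other)
```

The last two assertions show why surjectivity alone does not pin down an inverse: `x_min` and `x_other` both map to `y`, and the pseudo-inverse is what selects one of them.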

2. Bijective Completion and the Non-Linear Pseudo-Inverse

The central innovation of SPNNs is the notion of bijective completion. A surjective $f:X\to Y$ is paired with an extended mapping $g:X\to Z$, with $\dim Z = \dim X - \dim Y$, such that $F(x) = (f(x), g(x))$ is a global diffeomorphism (i.e., smooth and invertible with a smooth inverse). The natural non-linear pseudo-inverse is then defined as

$f^+(y) = F^{-1}(y, z_0),$

selecting a unique, canonical pre-image according to its minimal deviation from a reference location (typically $z_0 = 0$) in the completed space. This construction generalizes the minimum-norm criterion of linear pseudo-inversion and provides a canonical solution even for highly non-linear, non-injective mappings (Ehrlich et al., 5 Feb 2026, Wetzel, 8 Jan 2026, Beitler et al., 2021).
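A toy illustration of bijective completion follows, assuming the completed map is written $F(x) = (f(x), g(x))$ and the reference value is $0$. The functions `f`, `g`, `F`, `F_inv`, and `f_pinv` are hypothetical, chosen so that the completion has a closed-form inverse; they are not the learned networks of the cited works.

```python
import numpy as np

def f(x):
    # Surjective, non-injective map R^2 -> R: each y has a whole curve of pre-images.
    return x[0] + np.tanh(x[1])

def g(x):
    # Completion coordinate: records the information that f discards.
    return x[1]

def F(x):
    # Completed map R^2 -> R^2; smooth with a smooth inverse.
    return np.array([f(x), g(x)])

def F_inv(y, z):
    # Closed-form inverse of F.
    return np.array([y - np.tanh(z), z])

def f_pinv(y, z_ref=0.0):
    # Canonical pre-image: invert F at the reference completion value.
    return F_inv(y, z_ref)

x = np.array([0.3, -1.2])
y_target = 1.7

assert np.allclose(F_inv(*F(x)), x)                            # F is invertible
assert np.isclose(f(f_pinv(y_target)), y_target)               # f(f^+(y)) = y
assert np.allclose(f_pinv(f(f_pinv(y_target))), f_pinv(y_target))
```

Here `f_pinv(f(x))` does not return `x` in general: it returns the pre-image of `f(x)` whose completion coordinate equals the reference value.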

3. SPNN Layer Architectures: Surjective Coupling and Explicit Pseudo-Inversion

SPNNs are built from modular surjective building blocks. A prototypical SPNN block operates as follows:

Apply an orthogonal mixing (e.g., Cayley-parametrized 1×1 convolution).
Partition the mixed input $\tilde{x}$ into $(x_1, x_2)$.
The forward surjective mapping is

$y = x_1 \odot \exp(s(x_2)) + t(x_2),$

with $s, t$ neural networks and $\odot$ elementwise multiplication.

The pseudo-inverse reconstructs $x_2$ using an auxiliary network $r$, then solves for $x_1$:

$\hat{x}_2 = r(y), \qquad \hat{x}_1 = (y - t(\hat{x}_2)) \odot \exp(-s(\hat{x}_2)).$

This guarantees that $f(f^+(f(x))) = f(x)$ and $f^+(f(f^+(y))) = f^+(y)$ by construction (Ehrlich et al., 5 Feb 2026). Multi-scale SPNNs stack such blocks, interleaving downsampling or dimension-reducing splits, to map high-dimensional inputs to lower-dimensional output spaces while maintaining explicit pseudo-invertibility (Beitler et al., 2021).
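The block structure described above can be sketched in NumPy. This is a minimal illustration under stated assumptions: the coupling is taken to be affine, $y = x_1 \odot \exp(s(x_2)) + t(x_2)$, the orthogonal mixing is a dense Cayley-parametrized matrix rather than a 1×1 convolution, and `tiny_net` produces fixed random networks standing in for trained ones. All names (`cayley`, `tiny_net`, `forward`, `pseudo_inverse`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 6, 4            # the block maps R^6 -> R^4
d_rest = d_in - d_out         # coordinates dropped by the block

def cayley(d):
    # The Cayley transform of a skew-symmetric matrix is orthogonal.
    M = rng.normal(size=(d, d))
    S = M - M.T
    I = np.eye(d)
    return np.linalg.solve(I + S, I - S)

def tiny_net(n_in, n_out):
    # Fixed random one-hidden-layer network (assumption: stands in for a trained net).
    W1 = rng.normal(size=(8, n_in))
    W2 = 0.1 * rng.normal(size=(n_out, 8))
    return lambda v: W2 @ np.tanh(W1 @ v)

Q = cayley(d_in)              # orthogonal mixing
s = tiny_net(d_rest, d_out)   # log-scale network
t = tiny_net(d_rest, d_out)   # shift network
r = tiny_net(d_out, d_rest)   # auxiliary network predicting x2 from y

def forward(x):
    x_mixed = Q @ x
    x1, x2 = x_mixed[:d_out], x_mixed[d_out:]
    return x1 * np.exp(s(x2)) + t(x2)

def pseudo_inverse(y):
    x2_hat = r(y)                                    # reconstruct the dropped part
    x1_hat = (y - t(x2_hat)) * np.exp(-s(x2_hat))    # solve the coupling for x1
    return Q.T @ np.concatenate([x1_hat, x2_hat])    # undo the orthogonal mixing

y = rng.normal(size=d_out)
x_hat = pseudo_inverse(y)
assert np.allclose(forward(x_hat), y)                      # f(f^+(y)) = y
assert np.allclose(pseudo_inverse(forward(x_hat)), x_hat)  # f^+(f(f^+(y))) = f^+(y)
```

Both identities hold for any choice of `s`, `t`, and `r`, because the coupling is solved exactly for `x1` once `x2` is fixed. The quality of `r` affects which pre-image is selected, not whether it is a valid one.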

Affine surjective couplings, invertible flows with explicit dimension reduction via splits and residual penalties, and algorithmic inverse solvers (gradient-based or neural) are all encompassed within the SPNN framework (Ehrlich et al., 5 Feb 2026, Beitler et al., 2021, Tapson et al., 2012).

4. Inference Algorithms and Non-Linear Back-Projection

SPNNs exploit bijective completion to define Non-Linear Back-Projection (NLBP), a direct generalization of the linear null-space projection. NLBP computes, given a current estimate $x$ and a target $y$,

$\hat{x} = F^{-1}(y, g(x)),$

where $g$ is the completion mapping and $F = (f, g)$ is the completed bijection. This guarantees that $f(\hat{x}) = y$ and, among all possible $x'$ with $f(x') = y$, $\hat{x}$ is orthogonally closest to $x$ in the geometry induced by $g$. This approach yields tractable, deterministic inversion even in highly non-linear settings, and provides a consistent method for projecting arbitrary model outputs onto prescribed targets (Ehrlich et al., 5 Feb 2026, Wetzel, 8 Jan 2026).
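The link to linear null-space projection can be checked in a small example. The sketch assumes the NLBP update has the form $\hat{x} = F^{-1}(y, g(x))$ with completed map $F = (f, g)$, and specializes to a linear $f(x) = Ax$ whose completion coordinate is the null-space component $g(x) = N^\top x$. The names `g`, `F_inv`, and `nlbp` are hypothetical.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])      # surjective linear map f(x) = A x
A_pinv = np.linalg.pinv(A)

# Orthonormal basis N of the null space of A, read off from the full SVD.
_, _, Vt = np.linalg.svd(A)
N = Vt[A.shape[0]:].T                # shape (3, 1)

def g(x):
    # Completion coordinate: the null-space component of x.
    return N.T @ x

def F_inv(y, z):
    # Inverse of the completed map F(x) = (A x, N^T x).
    return A_pinv @ y + N @ z

def nlbp(x, y):
    # Keep the completion coordinate of x, replace the output coordinate by y.
    return F_inv(y, g(x))

x = np.array([0.5, -2.0, 1.0])
y = np.array([1.0, 3.0])
x_hat = nlbp(x, y)

assert np.allclose(A @ x_hat, y)                            # target is met exactly
assert np.allclose(x_hat, x - A_pinv @ (A @ x - y))         # classical back-projection
assert np.allclose(g(x_hat), g(x))                          # null-space content kept
```

In this linear special case the update coincides with the familiar back-projection $x - A^{\dagger}(Ax - y)$, which changes only the component of $x$ that $A$ can see.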

At the architectural level, pseudo-inverses may be computed via:

Explicit auxiliary neural regressors for local pre-image selection (Ehrlich et al., 5 Feb 2026, Beitler et al., 2021).
Gradient-based minimization (e.g., minimizing $\|f(x) - y\|^2$ using iterative solvers) (Jiang et al., 26 Aug 2025).
Twin neural network regression with anchor selection and adjustment prediction, especially for inverting non-injective functions on Euclidean domains (Wetzel, 8 Jan 2026).
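The gradient-based option in the list above can be sketched on a toy problem. The map `f` below (a single LeakyReLU layer with a wide, full-row-rank weight matrix) is a hypothetical example, not a model from the cited works; the step size and iteration count are likewise illustrative choices, and plain gradient descent is only one of many possible iterative solvers.

```python
import numpy as np

ALPHA = 0.5                                   # LeakyReLU negative slope
A = np.array([[0.6, 0.8, 0.0],
              [0.0, 0.0, 1.0]])               # wide matrix with full row rank

def leaky_relu(z):
    return np.where(z > 0, z, ALPHA * z)

def f(x):
    # Toy surjective map R^3 -> R^2.
    return leaky_relu(A @ x)

def invert_by_gradient_descent(y, steps=500, lr=0.5):
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        z = A @ x
        residual = leaky_relu(z) - y
        slope = np.where(z > 0, 1.0, ALPHA)   # derivative of LeakyReLU at z
        x -= lr * A.T @ (slope * residual)    # gradient of 0.5 * ||f(x) - y||^2
    return x

y = np.array([1.0, -0.5])
x_hat = invert_by_gradient_descent(y)
assert np.allclose(f(x_hat), y, atol=1e-8)
```

Unlike an explicit pseudo-inverse, the pre-image found this way depends on the initialization and the solver, so it is not canonical without an additional selection criterion.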

5. SPNNs in Generative Inversion and Zero-Shot Solving

SPNNs fundamentally extend the range of inverse problem solvers in deep learning:

In the context of generative models, SPNNs provide algorithmic tools for requesting any desired output and solving for an input yielding that output. For instance, deterministic diffusion models, GPT-style Transformers, and LeakyReLU MLPs are surjective for almost all parameter values, ensuring the existence of such inverse mappings (Jiang et al., 26 Aug 2025).
Zero-shot inversion of complex non-linear degradations—including optical, compression, or semantic (classification) operators—can be performed by integrating SPNN-defined NLBP within a generative prior’s (e.g., DDPM) sampling loop. This methodology enables range-consistent and null-space-preserving guidance to arbitrary semantic targets without retraining the generator (Ehrlich et al., 5 Feb 2026).
Attribute- or multi-attribute-constrained image reconstruction and editing are enabled via SPNN pseudo-inverse projection onto target feature subspaces, as demonstrated for CelebA-HQ face attribute inversion and attribute-controlled generation (Ehrlich et al., 5 Feb 2026).
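The way a back-projection step slots into a sampling loop can be sketched schematically. Everything here is a stand-in: `toy_denoise_step` is a hypothetical placeholder for one reverse step of a generative prior, the degradation `A` is linear so that the projection has a closed form, and `back_project` plays the role that NLBP would play for a non-linear operator. It is not the procedure of the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])          # toy degradation operator (linear for simplicity)
A_pinv = np.linalg.pinv(A)
PRIOR_MEAN = np.array([1.0, 1.0, 1.0])   # hypothetical point the toy denoiser pulls toward

def toy_denoise_step(x, t, num_steps):
    # Placeholder for one reverse step of a generative prior:
    # move toward the prior mean and inject noise that shrinks as t -> 0.
    noise_scale = 0.1 * t / num_steps
    return x + 0.3 * (PRIOR_MEAN - x) + noise_scale * rng.normal(size=x.shape)

def back_project(x, y):
    # Linear back-projection; an SPNN would apply NLBP at this point instead.
    return x - A_pinv @ (A @ x - y)

def guided_sample(y, num_steps=50):
    x = rng.normal(size=A.shape[1])
    for t in range(num_steps, 0, -1):
        x = toy_denoise_step(x, t, num_steps)   # prior update
        x = back_project(x, y)                  # enforce consistency with the target
    return x

y = np.array([2.0, 0.0])
x_sample = guided_sample(y)
assert np.allclose(A @ x_sample, y)
```

Because the projection is applied after every prior update, the final sample matches the target exactly, while the prior only influences the component that the operator does not constrain.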

6. Theoretical Guarantees and Functional Analysis Perspectives

SPNNs leverage results from nonlinear functional analysis, Fredholm theory, and degree-theoretic fixed-point arguments:

For infinite-dimensional operator learning, surjectivity can be enforced via coercivity and compactness (Leray–Schauder degree theory), while injectivity is achieved by structurally bijective or direct-sum-preserving layers (Furuya et al., 2023).
In finite-dimensional networks, surjectivity is a generic property for networks using Pre-LayerNorm residual blocks, LeakyReLU-MLPs, and certain linear-attention architectures: it holds for all parameters outside an exceptional set of measure zero (Jiang et al., 26 Aug 2025).
Pseudo-inverses can be constructed in the presence of nontrivial kernel or image structure, generalizing Moore–Penrose theory by selecting canonical pre-images via completion criteria or partition-of-unity-inverted blocks (Furuya et al., 2023).
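One simple sufficient condition for surjectivity in the finite-dimensional case can be made concrete. The sketch assumes a LeakyReLU MLP whose layer widths do not increase, so that each random weight matrix is wide and has full row rank with probability one; combined with the invertibility of LeakyReLU, a pre-image can then be built layer by layer. This is an illustrative special case with hypothetical names (`mlp`, `preimage`), not the genericity argument of the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHA = 0.1                                    # LeakyReLU negative slope

def leaky_relu(z):
    return np.where(z > 0, z, ALPHA * z)

def leaky_relu_inv(a):
    # LeakyReLU is a bijection of R, so it has an exact inverse.
    return np.where(a > 0, a, a / ALPHA)

# Widths shrink 5 -> 4 -> 2, so each weight matrix is a surjective linear map.
W1, b1 = rng.normal(size=(4, 5)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def mlp(x):
    return W2 @ leaky_relu(W1 @ x + b1) + b2

def preimage(y):
    h = np.linalg.pinv(W2) @ (y - b2)          # solve W2 h = y - b2
    z = leaky_relu_inv(h)                      # undo the activation
    return np.linalg.pinv(W1) @ (z - b1)       # solve W1 x = z - b1

y = np.array([3.0, -7.0])
x = preimage(y)
assert np.allclose(mlp(x), y)
```

The same target has many other pre-images, obtained by adding null-space components at either layer; `preimage` picks the minimum-norm solution at each linear step.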

A comparison of SPNN construction methods:

Method/Class	Key Surjectivity Mechanism	Pseudo-Inversion Strategy
Bijective completion (Ehrlich et al., 5 Feb 2026)	Explicit diffeomorphic lift	Nearest-completion minimization
Anchor-based TNNR (Wetzel, 8 Jan 2026)	k-NN anchor coverage in output	Twin network local adjustment regression
Affine-coupling SPNNs (Beitler et al., 2021)	Surjective splitting with penalty	Neural regression, tractable inversion
Random projection ELMs (Tapson et al., 2012)	High-dimensional surjective expansion	Closed-form or Greville incremental pseudo-inverse

7. Implications, Limitations, and Safety Considerations

SPNNs, by construction, guarantee that every output is "reachable": for any desired $y\in Y$, a pre-image $x\in X$ can be algorithmically produced. This surjectivity introduces inherent vulnerabilities:

Safety and Jailbreak Risk: Any output, including harmful or undesired content, is in principle attainable by finding the corresponding SPNN pseudo-inverse input. This has been demonstrated for both GPT-style and diffusion models (Jiang et al., 26 Aug 2025).
Robotic Control: Surjective policy networks permit adversarial trajectories to be constructed via pseudo-inverse sensor manipulation, raising safety-critical concerns in real-world deployments (Jiang et al., 26 Aug 2025).
Defensive Measures: Mitigating this attack surface, which stems from the mere existence of pre-images, requires either architectural modifications to break global surjectivity or post-hoc output filtering; re-training alone achieves neither (Jiang et al., 26 Aug 2025).

Open limitations and challenges include:

Numerical instability of global pseudo-inverses in high-dimensional or infinite-dimensional regimes due to kernel and singular value structure (Furuya et al., 2023).
Managing discretization and finite-rank approximation trade-offs while preserving surjectivity and injectivity (Furuya et al., 2023).
Efficiently training and integrating auxiliary pseudo-inverse networks with strong generalization properties across the SPNN’s range (Ehrlich et al., 5 Feb 2026).

Surjective Pseudo-Invertible Neural Networks thus establish a mathematically rigorous, algorithmically tractable, and practically impactful paradigm for addressing non-linear and non-injective inversion in deep learning, while also foregrounding critical safety and adversarial challenges in current and future generative models.