
Singular Semi-Riemannian Geometry in DNNs

Updated 12 January 2026
  • Singular semi-Riemannian geometry is a framework that employs degenerate (positive semidefinite) metrics to reveal network invariances in DNNs.
  • It systematically pulls back a Euclidean output metric through layers to construct singular metrics whose null directions characterize equivalence classes in input space.
  • Algorithmic approaches like the SiMEC-1D scheme facilitate fast sampling and visualization of high-dimensional data manifolds, aiding adversarial vulnerability analysis.

Singular semi-Riemannian geometry in deep neural networks (DNNs) is a mathematical framework that employs degenerate Riemannian metrics—metrics that are positive semidefinite but not invertible—to analyze the geometry induced by feed-forward, convolutional, residual, and recurrent networks at both the input and hidden layers. By systematically pulling back an output-layer metric (typically Euclidean) through the entire network, one obtains a sequence of singular metrics whose null directions exactly correspond to network invariances: directions along which the network output does not change. This approach enables a rigorous investigation of fibers (preimages) of the network map, equivalence classes in input space, geometric characterization of adversarial directions, and algorithmic techniques for visualizing, exploring, and sampling high-dimensional data manifolds as transformed by modern DNNs (Benfenati et al., 2021, Benfenati et al., 2021, Benfenati et al., 2024).

1. Construction of Singular Metrics in DNNs

Given a smooth neural network map $F: (M, g_M) \to (N, g_N)$, where $M \subset \mathbb{R}^m$ is the input space, $N \subset \mathbb{R}^n$ is the output space, and $g_N$ is a Riemannian metric on $N$, the pullback metric on $M$ is defined by

$$g = F^* g_N$$

with coordinate representation

$$g_{ij}(x) = \sum_{h,k=1}^n \frac{\partial F^h}{\partial x^i}(x) \, g^N_{hk}(F(x)) \, \frac{\partial F^k}{\partial x^j}(x).$$

In matrix form, $g(x) = J_F(x)^T\, g_N(F(x))\, J_F(x)$, where $J_F(x)$ is the Jacobian of $F$ at $x$. If $\operatorname{rank} J_F(x) < m$ (as in typical overparameterized or compressive neural architectures), $g(x)$ is positive semidefinite and defines only a singular (semi-)Riemannian structure. The kernel of $g(x)$ consists of vectors tangent to directions in input space that the network collapses to a single output value (Benfenati et al., 2021).
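
The following minimal sketch illustrates this construction for a toy PyTorch map; the network net, the helper pullback_metric, and the Euclidean choice of $g_N$ are illustrative assumptions, not code from the cited papers.

```python
# Minimal sketch: pull back a Euclidean output metric through a toy
# PyTorch network to obtain the (generically singular) metric g(x).
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(                        # hypothetical F: R^3 -> R^2
    torch.nn.Linear(3, 8), torch.nn.Tanh(), torch.nn.Linear(8, 2)
)

def pullback_metric(F, x):
    """g(x) = J_F(x)^T g_N(F(x)) J_F(x), with g_N taken to be the identity."""
    J = torch.autograd.functional.jacobian(F, x)  # shape (n_out, n_in)
    return J.T @ J                                # Euclidean g_N, so g = J^T J

x = torch.randn(3)
g = pullback_metric(net, x)
print(torch.linalg.eigvalsh(g))  # 3 eigenvalues; at least one is (numerically) zero
```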

In multilayer networks, this construction is naturally iterated: for each layer map $f_i: M_{i-1} \to M_i$ with output metric $g^{(i)}$, the pullback metric on $M_{i-1}$ is

$$g^{(i-1)} = f_i^* g^{(i)}.$$

For linear or piecewise-linear layers (such as fully connected layers, convolutions, or blockwise-ReLU networks), the null space of $g^{(i-1)}$ is determined by the kernel of the layer Jacobian and can be computed explicitly (Benfenati et al., 2021, Benfenati et al., 2024).
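
A short sketch of this layerwise iteration, continuing the toy setup above (net, x, and pullback_metric from the previous snippet); shapes, layer names, and the Euclidean output metric are illustrative assumptions.

```python
# Sketch of the backward sweep g^(i-1) = f_i^* g^(i), reusing net and x above.
def layerwise_pullback(layers, x, g_out=None):
    acts = [x]                                    # forward pass: cache x_0, ..., x_L
    for layer in layers:
        acts.append(layer(acts[-1]))
    g = g_out if g_out is not None else torch.eye(acts[-1].numel())
    metrics = [g]
    for layer, a in zip(reversed(layers), reversed(acts[:-1])):
        J = torch.autograd.functional.jacobian(layer, a)
        g = J.T @ g @ J                           # pull the metric back one layer
        metrics.append(g)
    return list(reversed(metrics))                # metrics[0] is the input-space metric

metrics = layerwise_pullback(list(net), x)
print([tuple(m.shape) for m in metrics])          # [(3, 3), (8, 8), (8, 8), (2, 2)]
```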

2. Equivalence Classes and Fiber Geometry

The equivalence class of an input $x \in M$ is the set

$$E_y = \{ x' \in M : F(x') = y \},$$

where $y = F(x)$. Under regularity conditions (typically, that $F$ has constant rank near $x$, as when $F$ is a submersion at $x$), $E_y$ is a smooth submanifold of $M$ of dimension $\dim M - \operatorname{rank} J_F(x)$. The tangent space to $E_y$ at $x$ is $\ker J_F(x)$, so the fibers correspond exactly to the null directions of the singular metric $g$ (Benfenati et al., 2021).

For networks mapping $\mathbb{R}^n \to \mathbb{R}^{n-1}$, generic fibers (preimages) are 1-dimensional curves. These equivalence classes organize input space into level sets of constant output and are fundamental for data augmentation, adversarial analysis, and understanding classifier boundaries.

The quotient space $Q = M/\!\sim$, under the equivalence relation $x \sim x'$ if $\delta(x, x') = 0$ for the pseudodistance $\delta$ induced by $g$, inherits a unique smooth structure, and the vertical bundle $VM = \ker d\pi$ (with $\pi: M \to Q$ the quotient projection) describes the family of fibers over the base $Q$ (Benfenati et al., 2021).
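
Continuing the toy example above, the tangent space of the fiber through $x$ can be recovered numerically as the null eigenspace of $g(x)$; the tolerance below is an arbitrary illustrative choice.

```python
# Sketch: T_x E_y = ker J_F(x) = ker g(x), recovered from an eigendecomposition.
eigvals, eigvecs = torch.linalg.eigh(pullback_metric(net, x))
tol = 1e-8
null_mask = eigvals < tol * eigvals.max()
kernel_basis = eigvecs[:, null_mask]   # columns span ker g(x) = T_x E_y
fiber_dim = int(null_mask.sum())       # = dim M - rank J_F(x)
print(fiber_dim, kernel_basis.shape)   # generically 1 and (3, 1) for a 3 -> 2 map
```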

3. Algorithmic Reconstruction of Equivalence Classes

For the explicit construction of preimages and equivalence classes, two approaches are prominent:

  • Continuous formulation: Integrate the differential equation $\dot x(t) = V(x(t))$, where $V(x)$ is a smooth, unit-length vector field spanning $\ker J_F(x)$. The resulting trajectory remains within the level set $E_y$ and satisfies $F(x(t)) = y$ for all $t$.
  • Discrete polygonal (SiMEC-1D) scheme: Iteratively step along the null eigenvector of $g(x_k)$:

    1. Compute $g(x_k) = J_F(x_k)^T\, g_N(F(x_k))\, J_F(x_k)$.
    2. Extract a unit eigenvector $v_k$ of $g(x_k)$ with zero eigenvalue (up to a numerical tolerance).
    3. Update $x_{k+1} = x_k + \delta v_k$, choosing the sign of $v_k$ to ensure orientation continuity.

Error control is achieved by monitoring the energy

$$E(\{x_k\}) = \sum_k \left( v_k^T g(x_k)\, v_k \right) \delta^2.$$

Domain violations are handled by projection or boundary checks (Benfenati et al., 2021).
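
The following is a compact sketch of the SiMEC-1D iteration described above, again in the toy setting where $\ker g$ is one-dimensional; the step size, iteration count, and orientation rule are illustrative choices, not the authors' reference implementation. It reuses net, x, and pullback_metric from the earlier snippet.

```python
# Sketch of SiMEC-1D: step along the null eigenvector of g(x_k), track the energy.
def simec_1d(F, x0, n_steps=200, delta=1e-2):
    xs, v_prev, energy = [x0.clone()], None, 0.0
    for _ in range(n_steps):
        g = pullback_metric(F, xs[-1])
        eigvals, eigvecs = torch.linalg.eigh(g)
        v = eigvecs[:, 0]                        # eigenvector of the smallest eigenvalue
        if v_prev is not None and torch.dot(v, v_prev) < 0:
            v = -v                               # keep a consistent orientation
        energy += float(v @ g @ v) * delta**2    # monitored error term E({x_k})
        xs.append(xs[-1] + delta * v)
        v_prev = v
    return torch.stack(xs), energy

path, E = simec_1d(net, x)
drift = float(torch.norm(net(path[-1]) - net(path[0])))
print(f"output drift: {drift:.2e}, energy: {E:.2e}")  # both should be small
```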

4. Geometric Insights: Invariances, Sensitivities, and Adversarial Vulnerability

Under the singular metric $g$, null directions represent invariances: perturbations along these directions do not change the network's output. The induced pseudodistance

$$Pd(x, x') = \inf_\gamma \int \sqrt{g_{\gamma(s)}(\dot{\gamma}, \dot{\gamma})} \, ds$$

is zero for $x$, $x'$ in the same fiber. Directions orthogonal to $\ker g$ with large eigenvalues measure sensitivity: small perturbations along them can yield significant output changes (Benfenati et al., 2021).

Adversarial examples often exploit such directions: perturbations of small Riemannian length but large effect in output space (large gradient norm directions) are candidates for adversarial attacks. The study of geodesics and the energy-minimizing paths in this singular geometry provides a coordinate-invariant framework for quantifying classifier robustness.
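
As an illustration in the toy setup above (net, x, pullback_metric), the most sensitive direction at $x$ is the top eigenvector of $g(x)$; with Euclidean $g_N$, a step of size $\varepsilon$ along it changes the output by roughly $\varepsilon\sqrt{\lambda_{\max}}$ to first order. This is a hedged sketch, not a prescribed attack procedure from the papers.

```python
# Sketch: largest-eigenvalue direction of g(x) as the most output-sensitive direction.
g = pullback_metric(net, x)
eigvals, eigvecs = torch.linalg.eigh(g)
v_sens = eigvecs[:, -1]                          # direction of largest eigenvalue
eps = 1e-3
delta_out = torch.norm(net(x + eps * v_sens) - net(x))
print(float(delta_out), eps * float(eigvals[-1].sqrt()))  # should roughly agree
```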

Diffusion processes governed by the Laplace–Beltrami operator

$$\Delta_g f = \frac{1}{\sqrt{|G|}}\, \partial_i \left( \sqrt{|G|}\, g^{ij}\, \partial_j f \right)$$

describe uncertainty and model sensitivity transverse to the fiber (Benfenati et al., 2024).

5. Extensions to Non-Smooth and Structured Layers

Convolutional, residual, and recurrent architectures are incorporated by applying the pullback construction to the appropriate differentiable or piecewise-differentiable maps. For ReLU and leaky-ReLU activations, $\mathbb{R}^d$ is partitioned into polytopal regions on which the map is $\mathcal{C}^1$ and $g$ is well defined; across region boundaries, $g$ typically undergoes a discontinuous change in signature or rank.

In convolutional networks, for example, the metric on the pre-flattened input space is constructed via automatic differentiation (e.g., in PyTorch), allowing explicit, data-driven computation of the singular geometry even in high dimensions (Benfenati et al., 2024).
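
A sketch of the same construction for a small convolutional network, with the metric assembled on the flattened input as described above; the architecture and the $8 \times 8$ single-channel input size are illustrative assumptions, not the MNIST setup of the papers.

```python
# Sketch: pullback metric on the flattened input of a toy convolutional network.
import torch

cnn = torch.nn.Sequential(
    torch.nn.Conv2d(1, 4, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Flatten(), torch.nn.Linear(4 * 8 * 8, 10)
)

def input_metric(model, x_flat, shape=(1, 1, 8, 8)):
    """Pullback of the Euclidean output metric onto the flattened input space."""
    F = lambda z: model(z.reshape(shape)).flatten()
    J = torch.autograd.functional.jacobian(F, x_flat)   # shape (10, 64)
    return J.T @ J                                       # 64 x 64, rank <= 10

g = input_metric(cnn, torch.randn(64))
print(int(torch.linalg.matrix_rank(g)))  # at most 10, so >= 54 null directions
```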

Recurrent networks (including LSTM architectures) are handled via the product structure of the input and hidden-state spaces, and the stepwise update maps allow layerwise application of the pullback metric construction.

6. Random Walks, Sampling, and Numerical Experiments

Random walks on equivalence classes are realized by repeatedly sampling unit-norm directions $v \in \ker g(x)$ and stepping $x \mapsto x + \delta v$. For high-dimensional classes, the probability of revisiting a previous location is exponentially small in the class dimension, of order $O(\delta^n)$. This provides an efficient means of exploring invariances and generating synthetic samples on the input manifold with fixed network output (Benfenati et al., 2024).
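
A sketch of such a random walk, continuing the toy setup from Section 1 (net, x, pullback_metric); the tolerance, step size, and step count are illustrative choices.

```python
# Sketch: random walk inside an equivalence class by sampling kernel directions.
def kernel_random_walk(F, x0, n_steps=100, delta=1e-2, tol=1e-8):
    x = x0.clone()
    for _ in range(n_steps):
        g = pullback_metric(F, x)
        eigvals, eigvecs = torch.linalg.eigh(g)
        K = eigvecs[:, eigvals < tol * eigvals.max()]   # basis of ker g(x)
        if K.shape[1] == 0:                             # no null direction: stop
            break
        w = K @ torch.randn(K.shape[1])                 # random kernel direction
        x = x + delta * w / torch.norm(w)
    return x

x_new = kernel_random_walk(net, x)
print(float(torch.norm(net(x_new) - net(x))))  # output stays approximately fixed
```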

Empirical results span:

  • Fully connected networks for nonlinear regression: SiMEC-1D explores level sets to high accuracy, as shown with analytic surfaces such as $z = e^{x^2 + y^2 - 2}$.

  • Convolutional neural networks on MNIST: for $784 \to 10$ softmax outputs, the null-class dimension is 774, and random walks yield synthetic samples classified identically.
  • Thermodynamic regression (power plant models): SiMEC walks in null directions keep the predicted output power fixed while exploring large regions of the input (environmental) variable space.

Step size selection is critical for numerical stability; larger $\delta$ can induce drift or exit the intended domain (Benfenati et al., 2021, Benfenati et al., 2024).

7. Theoretical and Practical Implications

The singular semi-Riemannian framework enables:

  • Identification of geometric structures underlying network invariances and decision boundaries.
  • Quantitative and coordinate-invariant analysis of robustness and adversarial vulnerability by examining Riemannian lengths and geodesics in the null and orthogonal directions.
  • Fast sampling and data augmentation in high-dimensional null classes via random walks, with complexity orders of magnitude lower than naive mesh-grid explorations.
  • Uniform treatment of modern DNN architectures, including non-smooth activations and structured layers, by local partitioning and piecewise analysis.

The methods, grounded in differential geometry and implemented via automatic differentiation and linear algebra, provide a bridge between abstract mathematical theory and practical machine learning, with applications to generative modeling, uncertainty quantification, and classifier interpretation (Benfenati et al., 2021, Benfenati et al., 2021, Benfenati et al., 2024).
