
Singular Semi-Riemannian Geometry in DNNs

Updated 12 January 2026
  • Singular semi-Riemannian geometry is a framework that employs degenerate (positive semidefinite) metrics to reveal network invariances in DNNs.
  • It systematically pulls back a Euclidean output metric through layers to construct singular metrics whose null directions characterize equivalence classes in input space.
  • Algorithmic approaches like the SiMEC-1D scheme facilitate fast sampling and visualization of high-dimensional data manifolds, aiding adversarial vulnerability analysis.

Singular semi-Riemannian geometry in deep neural networks (DNNs) is a mathematical framework that employs degenerate Riemannian metrics—metrics that are positive semidefinite but not invertible—to analyze the geometry induced by feed-forward, convolutional, residual, and recurrent networks at both the input and hidden layers. By systematically pulling back an output-layer metric (typically Euclidean) through the entire network, one obtains a sequence of singular metrics whose null directions exactly correspond to network invariances: directions along which the network output does not change. This approach enables a rigorous investigation of fibers (preimages) of the network map, equivalence classes in input space, geometric characterization of adversarial directions, and algorithmic techniques for visualizing, exploring, and sampling high-dimensional data manifolds as transformed by modern DNNs (Benfenati et al., 2021, Benfenati et al., 2021, Benfenati et al., 2024).

1. Construction of Singular Metrics in DNNs

Given a smooth neural network map $F: (M, g_M) \to (N, g_N)$, where $M \subset \mathbb{R}^m$ is the input space, $N \subset \mathbb{R}^n$ is the output space, and $g_N$ is a Riemannian metric on $N$, the pullback metric on $M$ is defined by

$$g = F^* g_N$$

with coordinate representation

$$g_{ij}(x) = \sum_{h,k=1}^n \frac{\partial F^h}{\partial x^i}(x) \, g^N_{hk}(F(x)) \, \frac{\partial F^k}{\partial x^j}(x).$$

In matrix form, $g(x) = J_F(x)^T\, g_N(F(x))\, J_F(x)$, where $J_F(x)$ is the Jacobian of $F$ at $x$. If $\operatorname{rank} J_F(x) < m$ (as in typical overparameterized or compressive neural architectures), $g(x)$ is positive semidefinite and defines only a singular (semi-)Riemannian structure. The kernel of $g(x)$ consists of vectors tangent to directions in input space that the network collapses to a single output value (Benfenati et al., 2021).
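
The following minimal sketch illustrates this construction for a toy PyTorch map; the network net, the helper pullback_metric, and the Euclidean choice of $g_N$ are illustrative assumptions, not code from the cited papers.

```python
# Minimal sketch: pull back a Euclidean output metric through a toy
# PyTorch network to obtain the (generically singular) metric g(x).
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(                        # hypothetical F: R^3 -> R^2
    torch.nn.Linear(3, 8), torch.nn.Tanh(), torch.nn.Linear(8, 2)
)

def pullback_metric(F, x):
    """g(x) = J_F(x)^T g_N(F(x)) J_F(x), with g_N taken to be the identity."""
    J = torch.autograd.functional.jacobian(F, x)  # shape (n_out, n_in)
    return J.T @ J                                # Euclidean g_N, so g = J^T J

x = torch.randn(3)
g = pullback_metric(net, x)
print(torch.linalg.eigvalsh(g))  # 3 eigenvalues; at least one is (numerically) zero
```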

In multilayer networks, this construction is naturally iterated: for each layer map $f_i: M_{i-1} \to M_i$ with output metric $g^{(i)}$, the pullback metric on $M_{i-1}$ is

$$g^{(i-1)} = f_i^* g^{(i)}.$$

For linear or piecewise-linear layers (such as fully connected layers, convolutions, or blockwise-ReLU networks), the null space of $g^{(i-1)}$ is determined by the kernel of the layer Jacobian and can be computed explicitly (Benfenati et al., 2021, Benfenati et al., 2024).
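
A short sketch of this layerwise iteration, continuing the toy setup above (net, x, and pullback_metric from the previous snippet); shapes, layer names, and the Euclidean output metric are illustrative assumptions.

```python
# Sketch of the backward sweep g^(i-1) = f_i^* g^(i), reusing net and x above.
def layerwise_pullback(layers, x, g_out=None):
    acts = [x]                                    # forward pass: cache x_0, ..., x_L
    for layer in layers:
        acts.append(layer(acts[-1]))
    g = g_out if g_out is not None else torch.eye(acts[-1].numel())
    metrics = [g]
    for layer, a in zip(reversed(layers), reversed(acts[:-1])):
        J = torch.autograd.functional.jacobian(layer, a)
        g = J.T @ g @ J                           # pull the metric back one layer
        metrics.append(g)
    return list(reversed(metrics))                # metrics[0] is the input-space metric

metrics = layerwise_pullback(list(net), x)
print([tuple(m.shape) for m in metrics])          # [(3, 3), (8, 8), (8, 8), (2, 2)]
```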

2. Equivalence Classes and Fiber Geometry

The equivalence class of an input $x \in M$ is the set

$$E_y = \{ x' \in M : F(x') = y \},$$

where $y = F(x)$. Under regularity conditions (typically, that $F$ has constant rank near $x$, as when $F$ is a submersion at $x$), $E_y$ is a smooth submanifold of $M$ of dimension $\dim M - \operatorname{rank} J_F(x)$. The tangent space to $E_y$ at $x$ is $\ker J_F(x)$, so the fibers correspond exactly to the null directions of the singular metric $g$ (Benfenati et al., 2021).

For networks mapping $\mathbb{R}^n \to \mathbb{R}^{n-1}$, generic fibers (preimages) are 1-dimensional curves. These equivalence classes organize input space into level sets of constant output and are fundamental for data augmentation, adversarial analysis, and understanding classifier boundaries.

The quotient space $Q = M/\!\sim$, under the equivalence relation $x \sim x'$ if $\delta(x, x') = 0$ for the pseudodistance $\delta$ induced by $g$, inherits a unique smooth structure, and the vertical bundle $VM = \ker d\pi$ (with $\pi: M \to Q$ the quotient projection) describes the family of fibers over the base $Q$ (Benfenati et al., 2021).
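
Continuing the toy example above, the tangent space of the fiber through $x$ can be recovered numerically as the null eigenspace of $g(x)$; the tolerance below is an arbitrary illustrative choice.

```python
# Sketch: T_x E_y = ker J_F(x) = ker g(x), recovered from an eigendecomposition.
eigvals, eigvecs = torch.linalg.eigh(pullback_metric(net, x))
tol = 1e-8
null_mask = eigvals < tol * eigvals.max()
kernel_basis = eigvecs[:, null_mask]   # columns span ker g(x) = T_x E_y
fiber_dim = int(null_mask.sum())       # = dim M - rank J_F(x)
print(fiber_dim, kernel_basis.shape)   # generically 1 and (3, 1) for a 3 -> 2 map
```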

3. Algorithmic Reconstruction of Equivalence Classes

For the explicit construction of preimages and equivalence classes, two approaches are prominent:

  • Continuous formulation: Integrate the differential equation $\dot x(t) = V(x(t))$, where $V(x)$ is a smooth, unit-length vector field spanning $\ker J_F(x)$. The resulting trajectory remains within the level set $E_y$ and satisfies $F(x(t)) = y$ for all $t$.
  • Discrete polygonal (SiMEC-1D) scheme: Iteratively step along the null eigenvector of $g(x_k)$:

    1. Compute $g(x_k) = J_F(x_k)^T\, g_N(F(x_k))\, J_F(x_k)$.
    2. Extract a unit eigenvector $v_k$ of $g(x_k)$ with zero eigenvalue (up to a numerical tolerance).
    3. Update $x_{k+1} = x_k + \delta v_k$, choosing the sign of $v_k$ to ensure orientation continuity.

Error control is achieved by monitoring the energy

$$E(\{x_k\}) = \sum_k \left( v_k^T g(x_k)\, v_k \right) \delta^2.$$

Domain violations are handled by projection or boundary checks (Benfenati et al., 2021).
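
The following is a compact sketch of the SiMEC-1D iteration described above, again in the toy setting where $\ker g$ is one-dimensional; the step size, iteration count, and orientation rule are illustrative choices, not the authors' reference implementation. It reuses net, x, and pullback_metric from the earlier snippet.

```python
# Sketch of SiMEC-1D: step along the null eigenvector of g(x_k), track the energy.
def simec_1d(F, x0, n_steps=200, delta=1e-2):
    xs, v_prev, energy = [x0.clone()], None, 0.0
    for _ in range(n_steps):
        g = pullback_metric(F, xs[-1])
        eigvals, eigvecs = torch.linalg.eigh(g)
        v = eigvecs[:, 0]                        # eigenvector of the smallest eigenvalue
        if v_prev is not None and torch.dot(v, v_prev) < 0:
            v = -v                               # keep a consistent orientation
        energy += float(v @ g @ v) * delta**2    # monitored error term E({x_k})
        xs.append(xs[-1] + delta * v)
        v_prev = v
    return torch.stack(xs), energy

path, E = simec_1d(net, x)
drift = float(torch.norm(net(path[-1]) - net(path[0])))
print(f"output drift: {drift:.2e}, energy: {E:.2e}")  # both should be small
```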

4. Geometric Insights: Invariances, Sensitivities, and Adversarial Vulnerability

Under the singular metric $g$, null directions represent invariances: perturbations along these directions do not change the network's output. The induced pseudodistance

$$Pd(x, x') = \inf_\gamma \int \sqrt{g_{\gamma(s)}(\dot{\gamma}, \dot{\gamma})} \, ds$$

is zero for $x$, $x'$ in the same fiber. Directions orthogonal to $\ker g$ with large eigenvalues measure sensitivity: small perturbations along them can yield significant output changes (Benfenati et al., 2021).

Adversarial examples often exploit such directions: perturbations of small Riemannian length but large effect in output space (large gradient norm directions) are candidates for adversarial attacks. The study of geodesics and the energy-minimizing paths in this singular geometry provides a coordinate-invariant framework for quantifying classifier robustness.
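
As an illustration in the toy setup above (net, x, pullback_metric), the most sensitive direction at $x$ is the top eigenvector of $g(x)$; with Euclidean $g_N$, a step of size $\varepsilon$ along it changes the output by roughly $\varepsilon\sqrt{\lambda_{\max}}$ to first order. This is a hedged sketch, not a prescribed attack procedure from the papers.

```python
# Sketch: largest-eigenvalue direction of g(x) as the most output-sensitive direction.
g = pullback_metric(net, x)
eigvals, eigvecs = torch.linalg.eigh(g)
v_sens = eigvecs[:, -1]                          # direction of largest eigenvalue
eps = 1e-3
delta_out = torch.norm(net(x + eps * v_sens) - net(x))
print(float(delta_out), eps * float(eigvals[-1].sqrt()))  # should roughly agree
```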

Diffusion processes governed by the Laplace–Beltrami operator

$$\Delta_g f = \frac{1}{\sqrt{|G|}}\, \partial_i \left( \sqrt{|G|}\, g^{ij}\, \partial_j f \right)$$

describe uncertainty and model sensitivity transverse to the fiber (Benfenati et al., 2024).

5. Extensions to Non-Smooth and Structured Layers

Convolutional, residual, and recurrent architectures are incorporated by applying the pullback construction to the appropriate differentiable or piecewise-differentiable maps. For ReLU and leaky-ReLU activations, $\mathbb{R}^d$ is partitioned into polytopal regions on which the map is $\mathcal{C}^1$ and $g$ is well defined; across region boundaries, $g$ typically undergoes a discontinuous change in signature or rank.

In convolutional networks, for example, the metric on the pre-flattened input space is constructed via automatic differentiation (e.g., in PyTorch), allowing explicit, data-driven computation of the singular geometry even in high dimensions (Benfenati et al., 2024).
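
A sketch of the same construction for a small convolutional network, with the metric assembled on the flattened input as described above; the architecture and the $8 \times 8$ single-channel input size are illustrative assumptions, not the MNIST setup of the papers.

```python
# Sketch: pullback metric on the flattened input of a toy convolutional network.
import torch

cnn = torch.nn.Sequential(
    torch.nn.Conv2d(1, 4, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Flatten(), torch.nn.Linear(4 * 8 * 8, 10)
)

def input_metric(model, x_flat, shape=(1, 1, 8, 8)):
    """Pullback of the Euclidean output metric onto the flattened input space."""
    F = lambda z: model(z.reshape(shape)).flatten()
    J = torch.autograd.functional.jacobian(F, x_flat)   # shape (10, 64)
    return J.T @ J                                       # 64 x 64, rank <= 10

g = input_metric(cnn, torch.randn(64))
print(int(torch.linalg.matrix_rank(g)))  # at most 10, so >= 54 null directions
```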

Recurrent networks (including LSTM architectures) are handled via the product structure of the input and hidden-state spaces, and the stepwise update maps allow layerwise application of the pullback metric construction.

6. Random Walks, Sampling, and Numerical Experiments

Random walks on equivalence classes are realized by repeatedly sampling unit-norm directions $v \in \ker g(x)$ and stepping $x \mapsto x + \delta v$. For high-dimensional classes, the probability of revisiting a previous location is exponentially small in the class dimension, of order $O(\delta^n)$. This provides an efficient means of exploring invariances and generating synthetic samples on the input manifold with fixed network output (Benfenati et al., 2024).
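
A sketch of such a random walk, continuing the toy setup from Section 1 (net, x, pullback_metric); the tolerance, step size, and step count are illustrative choices.

```python
# Sketch: random walk inside an equivalence class by sampling kernel directions.
def kernel_random_walk(F, x0, n_steps=100, delta=1e-2, tol=1e-8):
    x = x0.clone()
    for _ in range(n_steps):
        g = pullback_metric(F, x)
        eigvals, eigvecs = torch.linalg.eigh(g)
        K = eigvecs[:, eigvals < tol * eigvals.max()]   # basis of ker g(x)
        if K.shape[1] == 0:                             # no null direction: stop
            break
        w = K @ torch.randn(K.shape[1])                 # random kernel direction
        x = x + delta * w / torch.norm(w)
    return x

x_new = kernel_random_walk(net, x)
print(float(torch.norm(net(x_new) - net(x))))  # output stays approximately fixed
```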

Empirical results span:

  • Fully connected networks for nonlinear regression: SiMEC-1D explores level sets to high accuracy, as shown with analytic surfaces such as $z = e^{x^2 + y^2 - 2}$.

  • Convolutional neural networks on MNIST: for $784 \to 10$ softmax outputs, the null-class dimension is 774, and random walks yield synthetic samples classified identically.
  • Thermodynamic regression (power plant models): SiMEC walks in null directions keep the predicted output power fixed while exploring large regions of the input (environmental) variable space.

Step size selection is critical for numerical stability; larger $\delta$ can induce drift or exit the intended domain (Benfenati et al., 2021, Benfenati et al., 2024).

7. Theoretical and Practical Implications

The singular semi-Riemannian framework enables:

  • Identification of geometric structures underlying network invariances and decision boundaries.
  • Quantitative and coordinate-invariant analysis of robustness and adversarial vulnerability by examining Riemannian lengths and geodesics in the null and orthogonal directions.
  • Fast sampling and data augmentation in high-dimensional null classes via random walks, with complexity orders of magnitude lower than naive mesh-grid explorations.
  • Uniform treatment of modern DNN architectures, including non-smooth activations and structured layers, by local partitioning and piecewise analysis.

The methods, grounded in differential geometry and implemented via automatic differentiation and linear algebra, provide a bridge between abstract mathematical theory and practical machine learning, with applications to generative modeling, uncertainty quantification, and classifier interpretation (Benfenati et al., 2021, Benfenati et al., 2021, Benfenati et al., 2024).
