Score-Based Pullback Formulations

Updated 5 March 2026

Score-based pullback formulations define a Riemannian metric on data manifolds by pulling back the Euclidean metric through the score map, capturing intrinsic geometric properties.
They enable closed-form geodesic computations and Riemannian distances, facilitating effective manifold interpolation and applications like Riemannian autoencoders.
Integration with anisotropic normalizing flows and isometry regularization supports scalability, accurate dimension estimation, and robust manifold learning.

Score-based pullback formulations constitute a scalable, data-driven approach to extracting and using the Riemannian geometry of data manifolds. They combine pullback Riemannian geometry with generative modeling, specifically the differential structure induced by the probability density function of the data, and define a metric structure as the pullback of the Euclidean metric under the score map. The construction yields closed-form geodesics aligned with the data distribution, interpretable global charts, autoencoders with dimension estimation, and efficient integration with anisotropic normalizing flows. Key manifold operations have closed-form solutions and reconstruction error can be bounded, which keeps both geometry extraction and learning tractable at scale (Diepeveen et al., 2024).

1. Score-Based Pullback Metric

Let $p(x)$ be a smooth probability density on $\mathbb{R}^n$, with corresponding score $s(x) = \nabla_x \log p(x)$. The score-based pullback metric endows $\mathbb{R}^n$ with a Riemannian metric $g_x$, defined as the pullback of the standard Euclidean metric under the score map $\Phi(x) = s(x)$. For tangent vectors $v, w \in T_x\mathbb{R}^n \cong \mathbb{R}^n$, the local inner product becomes

$g_x(v, w) = \langle D_x s [v], D_x s [w] \rangle_{\ell^2} = (D_x s(x) v)^\top (D_x s(x) w),$

where $D_x s(x)$ is the Jacobian of the score. In matrix notation, the metric tensor is

$G(x) = (D_x s(x))^\top D_x s(x) \in \mathbb{R}^{n \times n}.$

This data-driven construction ties the local geometry to the data distribution: $D_x s(x)$ is the Hessian of $\log p$, so $G(x)$ encodes second-order structure of the log-density and is positive definite wherever that Hessian is invertible.
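As a minimal sketch of the definition above, the following evaluates $G(x)$ for a Gaussian density, whose score is known analytically. The helper names (`gaussian_score`, `numerical_jacobian`, `pullback_metric`) and the finite-difference Jacobian are illustrative assumptions, not part of the cited framework; in practice an automatic-differentiation Jacobian would likely replace the finite differences.

```python
import numpy as np

def gaussian_score(x, mu, cov_inv):
    """Score of N(mu, cov): grad_x log p(x) = -cov^{-1} (x - mu)."""
    return -cov_inv @ (x - mu)

def numerical_jacobian(f, x, h=1e-5):
    """Central-difference Jacobian of f at x; column j holds df/dx_j."""
    n = x.size
    cols = []
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        cols.append((f(x + e) - f(x - e)) / (2.0 * h))
    return np.stack(cols, axis=1)

def pullback_metric(score_fn, x):
    """Metric tensor G(x) = J^T J, with J the Jacobian of the score at x."""
    J = numerical_jacobian(score_fn, x)
    return J.T @ J

# Hypothetical 2D Gaussian used only for illustration.
mu = np.array([0.0, 0.0])
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
cov_inv = np.linalg.inv(cov)
x = np.array([0.3, -1.2])

G = pullback_metric(lambda z: gaussian_score(z, mu, cov_inv), x)

# Local inner product g_x(v, w) = v^T G(x) w for two tangent vectors.
v = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])
g_vw = v @ G @ w
```

For a Gaussian the score Jacobian is the constant matrix $-\Sigma^{-1}$, so the metric is the same at every point and equals $\Sigma^{-2}$; a non-Gaussian density would give a position-dependent $G(x)$.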

2. Geodesics and Riemannian Distance

For general densities, the geodesic equation on $(\mathbb{R}^n, g)$ takes the form $\nabla_{\dot{\gamma}} \dot{\gamma} = 0$, with $\nabla$ the Levi-Civita connection of $g$, leading to a second-order ODE involving the Christoffel symbols of $G(x)$. Crucially, if the density admits the factorization $p(x) \propto e^{-\psi(\varphi(x))}$, with strongly convex $\psi$ and diffeomorphism $\varphi$, geodesics and their Riemannian distances admit closed-form solutions. Specifically, the geodesic $\gamma_{x,y}(t)$ between $x$ and $y$ is

$\gamma_{x, y}(t) = \left( \varphi^{-1} \circ \nabla \psi^* \right) \left( (1-t) \nabla \psi \circ \varphi(x) + t \nabla \psi \circ \varphi(y) \right), \quad t \in [0, 1],$

where $\psi^*$ is the Fenchel conjugate of $\psi$. In the important quadratic case $\psi(v) = \frac{1}{2} v^\top \Sigma^{-1} v$, with $\Sigma$ symmetric positive definite, these simplify:

$\gamma_{x, y}(t) = \varphi^{-1}\big((1-t)\varphi(x) + t\varphi(y)\big),$

$d_g(x, y) = \|\Sigma^{-1}(\varphi(x)-\varphi(y))\|_2.$
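The simplification of the geodesic can be checked by substituting the quadratic $\psi$ into the general formula. A short worked derivation, using only the definitions above and the standard Fenchel conjugate of a positive definite quadratic form:

$\nabla \psi(v) = \Sigma^{-1} v, \qquad \psi^*(w) = \tfrac{1}{2} w^\top \Sigma w, \qquad \nabla \psi^*(w) = \Sigma w,$

$\gamma_{x, y}(t) = \varphi^{-1}\Big( \Sigma \big[ (1-t)\, \Sigma^{-1} \varphi(x) + t\, \Sigma^{-1} \varphi(y) \big] \Big) = \varphi^{-1}\big( (1-t)\varphi(x) + t\varphi(y) \big).$

The factors $\Sigma$ and $\Sigma^{-1}$ cancel, so the covariance affects the distance but not the shape of the geodesic.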

The linearization induced by $\Phi$ maps the density to a Gaussian-like structure, so straight-line interpolation in feature space is mapped back to curves in $x$-space that follow regions of high data density.
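The quadratic-case formulas can be illustrated with a toy example. The sketch below assumes a hypothetical diffeomorphism of $\mathbb{R}^2$ that straightens a parabola, and a diagonal $\Sigma$ chosen arbitrarily; neither comes from the cited work, where $\varphi$ would be a trained flow.

```python
import numpy as np

def phi(x):
    """Hypothetical diffeomorphism: maps the parabola x2 = x1^2 to the line u2 = 0."""
    return np.array([x[0], x[1] - x[0] ** 2])

def phi_inv(u):
    """Exact inverse of phi."""
    return np.array([u[0], u[1] + u[0] ** 2])

# Assumed covariance: large variance along the parabola, small across it.
sigma = np.diag([1.0, 0.05])
sigma_inv = np.linalg.inv(sigma)

def geodesic(x, y, t):
    """Closed-form geodesic: linear interpolation in feature space, mapped back."""
    return phi_inv((1.0 - t) * phi(x) + t * phi(y))

def distance(x, y):
    """Closed-form Riemannian distance ||Sigma^{-1} (phi(x) - phi(y))||_2."""
    return np.linalg.norm(sigma_inv @ (phi(x) - phi(y)))

# Two points on the parabola x2 = x1^2.
x = np.array([-1.0, 1.0])
y = np.array([1.0, 1.0])

path = np.stack([geodesic(x, y, t) for t in np.linspace(0.0, 1.0, 5)])
d_xy = distance(x, y)
```

Here the geodesic midpoint is the parabola's vertex $(0, 0)$ rather than the Euclidean midpoint $(0, 1)$, which is the sense in which the curve follows the high-density region.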

3. Riemannian Autoencoder Construction

The framework supports global charting of the data manifold via a Riemannian autoencoder (RAE), which exploits the quadratic-pullback scenario for explicit encoding and decoding maps. Given base point $x_0 = \varphi^{-1}(0)$ and a selection of $d_\varepsilon$ principal variance directions, the encoder $E_\varepsilon: \mathbb{R}^n \rightarrow \mathbb{R}^{d_\varepsilon}$ is defined, via the Riemannian log map, as

$E_\varepsilon(x)_i = \langle \log^\varphi_{x_0}(x), v_i \rangle_g, \qquad v_i = D_0 \varphi^{-1}[e_i].$

In the quadratic case, this reduces to coordinate selection in feature space. The decoder $D_\varepsilon: \mathbb{R}^{d_\varepsilon} \rightarrow \mathbb{R}^n$ is

$D_\varepsilon(z) = \varphi^{-1}\left( \sum_{i=1}^{d_\varepsilon} z_i e_i \right).$

This formulation admits provable reconstruction error bounds: if the variances of the neglected directions sum to at most $\varepsilon \sum_{i=1}^n \Sigma_{ii}$, then

$\mathbb{E}_{x \sim p}\left[ \| D_\varepsilon(E_\varepsilon(x)) - x \|_2^2 \right] \leq C\varepsilon \sum_{i=1}^n \Sigma_{ii} + o(\varepsilon),$

with $C$ dependent on Jacobian norms and determinants of $\varphi$ and $\varphi^{-1}$.
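The encoder and decoder can be sketched for the quadratic case with a diagonal $\Sigma$, where the principal variance directions are coordinate axes in feature space. The toy diffeomorphism, the variances, and the helper names (`select_directions`, `encode`, `decode`) are assumptions made for illustration only.

```python
import numpy as np

def phi(x):
    """Hypothetical diffeomorphism: maps the parabola x2 = x1^2 to the line u2 = 0."""
    return np.array([x[0], x[1] - x[0] ** 2])

def phi_inv(u):
    """Exact inverse of phi; the base point is x0 = phi_inv(0) = (0, 0)."""
    return np.array([u[0], u[1] + u[0] ** 2])

def select_directions(variances, eps):
    """Drop the smallest-variance directions while their sum stays <= eps * total."""
    order = np.argsort(variances)  # ascending variance
    budget = eps * variances.sum()
    dropped, n_drop = 0.0, 0
    for idx in order:
        if dropped + variances[idx] > budget:
            break
        dropped += variances[idx]
        n_drop += 1
    return np.sort(order[n_drop:])  # indices of retained directions

def encode(x, kept):
    """Encoder: coordinate selection in feature space."""
    return phi(x)[kept]

def decode(z, kept, n):
    """Decoder: zero-pad the latent code, then map back through phi^{-1}."""
    u = np.zeros(n)
    u[kept] = z
    return phi_inv(u)

# Assumed diagonal of Sigma: one dominant direction, one nearly flat.
variances = np.array([1.0, 0.01])
kept = select_directions(variances, eps=0.05)  # latent dimension is len(kept)

x_on = np.array([0.5, 0.25])   # lies on the parabola
x_off = np.array([0.5, 0.35])  # slightly off the parabola

rec_on = decode(encode(x_on, kept), kept, n=2)
rec_off = decode(encode(x_off, kept), kept, n=2)
```

Points on the parabola are reconstructed exactly, while off-manifold points are projected onto it; the number of retained directions serves as the dimension estimate in this sketch.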

4. Integration with Anisotropic Normalizing Flows

Score-based pullback geometry integrates naturally with anisotropic normalizing flows (NFs), allowing learned diffeomorphisms $\varphi_\theta$ to parameterize the transformation to a latent Gaussian structure. The NF is trained with an objective that adds volume and isometry regularization to the negative log-likelihood:

$\mathcal{L}(\phi, \theta) = \mathbb{E}_{x \sim p_{\mathrm{data}}}[-\log p_{\phi, \theta}(x)] + \lambda_{\mathrm{vol}} \mathbb{E}[ (\log|\det D_x \varphi_\theta|)^2 ] + \lambda_{\mathrm{iso}} \mathbb{E}[ \| D_x \varphi_\theta^\top D_x \varphi_\theta - I \|_F^2 ].$

Here, the isometry regularizer

$R_{\mathrm{iso}}(\theta) = \mathbb{E}\left\| D_x \varphi_\theta^\top D_x \varphi_\theta - I \right\|_F^2$

enforces approximate local orthonormality of $D_x \varphi_\theta$, ensuring that $\varphi_\theta$ approaches a local $\ell^2$-isometry and aligning the learned pullback metric with the score-based metric.
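To make the two regularizers concrete, the sketch below estimates them by Monte Carlo for two simple maps standing in for $\varphi_\theta$: a rotation, which is an exact isometry, and a uniform scaling, which is not. The function names and the finite-difference Jacobian are illustrative assumptions; a trained flow would normally expose its Jacobian through automatic differentiation.

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-5):
    """Central-difference Jacobian of f at x; column j holds df/dx_j."""
    n = x.size
    cols = []
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        cols.append((f(x + e) - f(x - e)) / (2.0 * h))
    return np.stack(cols, axis=1)

def isometry_penalty(f, points):
    """Monte Carlo estimate of E || J^T J - I ||_F^2 over the sample points."""
    n = points.shape[1]
    vals = []
    for x in points:
        J = numerical_jacobian(f, x)
        vals.append(np.linalg.norm(J.T @ J - np.eye(n), ord="fro") ** 2)
    return float(np.mean(vals))

def volume_penalty(f, points):
    """Monte Carlo estimate of E (log |det J|)^2 over the sample points."""
    vals = []
    for x in points:
        J = numerical_jacobian(f, x)
        _, logabsdet = np.linalg.slogdet(J)
        vals.append(logabsdet ** 2)
    return float(np.mean(vals))

# Two hypothetical maps standing in for a learned flow.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])

def rotation(x):
    return R @ x

def scaling(x):
    return 2.0 * x

rng = np.random.default_rng(0)
points = rng.normal(size=(16, 2))

iso_rot = isometry_penalty(rotation, points)    # rotation: J^T J = I
iso_scale = isometry_penalty(scaling, points)   # scaling: J^T J - I = 3 I
vol_rot = volume_penalty(rotation, points)      # rotation: |det J| = 1
vol_scale = volume_penalty(scaling, points)     # scaling: det J = 4
```

The rotation incurs essentially zero penalty on both terms, whereas the scaling is penalized by both, which is the behavior the regularized objective relies on to push $\varphi_\theta$ toward a local isometry.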

5. Scalability and Computational Properties

The methodology is architected for efficiency in both training and downstream geometric operations. The workflow is as follows:

Anisotropic NF $\varphi_\theta$ and covariance $\Sigma_\phi$ are trained via the loss $\mathcal{L}(\phi, \theta)$.
The learned $\varphi$ and $\Sigma$ are fixed, and the pullback metric $g_x = (D_x(\Sigma^{-1}\varphi(x)))^\top D_x(\Sigma^{-1}\varphi(x))$ is constructed.
Geodesics and Riemannian distances are computed in closed form, bypassing ODE integration.
The Riemannian autoencoder is constructed using principal variance directions and closed-form encoding/decoding maps.
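The second and third steps of this workflow can be cross-checked numerically: the Riemannian length of the closed-form geodesic, measured with the constructed metric $g_x$, should match the closed-form distance. The sketch below assumes a fixed toy $\varphi$ with an analytic Jacobian and an arbitrary diagonal $\Sigma$ in place of trained quantities; all names are hypothetical.

```python
import numpy as np

# Assumed inverse covariance Sigma^{-1} for Sigma = diag(1, 0.05).
sigma_inv = np.diag([1.0, 20.0])

def phi(x):
    """Hypothetical diffeomorphism standing in for a trained flow."""
    return np.array([x[0], x[1] - x[0] ** 2])

def phi_inv(u):
    return np.array([u[0], u[1] + u[0] ** 2])

def jac_phi(x):
    """Analytic Jacobian D_x phi(x) of the toy map."""
    return np.array([[1.0, 0.0], [-2.0 * x[0], 1.0]])

def metric(x):
    """g_x = A^T A with A = D_x(Sigma^{-1} phi(x)) = Sigma^{-1} D_x phi(x)."""
    A = sigma_inv @ jac_phi(x)
    return A.T @ A

def geodesic(x, y, t):
    """Closed-form geodesic in the quadratic case."""
    return phi_inv((1.0 - t) * phi(x) + t * phi(y))

def closed_form_distance(x, y):
    return np.linalg.norm(sigma_inv @ (phi(x) - phi(y)))

def curve_length(x, y, steps=200):
    """Riemannian length of the geodesic, approximated by midpoint quadrature."""
    ts = np.linspace(0.0, 1.0, steps + 1)
    pts = np.stack([geodesic(x, y, t) for t in ts])
    length = 0.0
    for a, b in zip(pts[:-1], pts[1:]):
        d = b - a
        length += np.sqrt(d @ metric(0.5 * (a + b)) @ d)
    return length

x = np.array([-1.0, 1.5])
y = np.array([1.0, 0.8])

d_closed = closed_form_distance(x, y)
d_numeric = curve_length(x, y)
```

Agreement between `d_closed` and `d_numeric` is a cheap sanity check that the metric and the closed-form maps were built from the same $\varphi$ and $\Sigma$, without integrating the geodesic ODE.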

Computational costs are dominated by the Jacobian computation:

Metric evaluation: $O(n^2)$ per data point.
Geodesic interpolation: $O(n)$ with precomputed $\varphi$, $\psi$.
Isometry loss: $O(n^3)$ naively, reduced to $O(n)$ – $O(n^2)$ per layer with structured parameterizations.

This framework is the first scalable approach for extracting the complete geometry of the data manifold, producing geodesics that traverse data support, estimating intrinsic dimensions, and enabling interpretable manifold learning (Diepeveen et al., 2024).

6. Empirical Performance and Applications

Empirical results on diverse datasets, including image data, demonstrate that score-based pullback formulations yield high-quality geodesics restricted to the data manifold, accurate estimation of intrinsic manifold dimension, and coherent global charts. The use of isometry regularization in conjunction with anisotropic flows ensures that the learned geometry is faithful to the underlying data distribution, facilitating effective representation learning, manifold interpolation, and downstream inference tasks. The construction also provides non-asymptotic error guarantees for autoencoder reconstruction, giving rigorous performance control for practical applications (Diepeveen et al., 2024).


References

Diepeveen et al. (2024). Score-based Pullback Riemannian Geometry: Extracting the Data Manifold Geometry using Anisotropic Flows.
