- The paper introduces two novel algorithms that recover intrinsic manifold distances from noisy pairwise measurements without relying on strict noise moment assumptions.
- It leverages empirical L2 inner products and a regularized optimization framework to achieve robust recovery under mild geometric and statistical noise conditions.
- The proposed methods offer practical benefits for inverse problems in imaging and seismic applications by ensuring efficient and accurate geometric reconstruction.
Reconstruction of Manifold Distances from Noisy Observations
Problem Setting and Motivation
This work focuses on the fundamental problem of reconstructing the intrinsic geometry of metric probability spaces (specifically, Riemannian manifolds) using only noisy observations of pairwise geodesic distances between randomly sampled points. The task is highly relevant in manifold learning, inverse problems in imaging, and data-driven geometric reconstruction. The setup assumes a $d$-dimensional manifold $M$ of diameter $1$, endowed with a probability measure $\mu$ that is mutually absolutely continuous with the volume measure. The observable data are noisy distances $d'(x_j, x_k)$, related in a general fashion to the true metric distances $d(x_j, x_k)$ between samples $X_1, \dots, X_N \sim \mu$.
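As a toy illustration of this setup, one can take $M$ to be the unit circle rescaled to diameter $1$, sample points from the uniform measure, and corrupt the geodesic distances with a symmetric multiplicative perturbation. The specific noise model and all names below are illustrative choices for a sketch, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: M is the unit circle (d = 1), rescaled so diam(M) = 1,
# and mu is the uniform measure on it.
N = 500
theta = rng.uniform(0.0, 2.0 * np.pi, size=N)   # samples X_1, ..., X_N ~ mu

def geodesic(a, b):
    """Geodesic distance on the circle, rescaled to diameter 1."""
    gap = np.abs(a - b) % (2.0 * np.pi)
    return np.minimum(gap, 2.0 * np.pi - gap) / np.pi

# True pairwise distances d(x_j, x_k).
D = geodesic(theta[:, None], theta[None, :])

# Noisy observations d'(x_j, x_k): a symmetric, non-additive perturbation,
# chosen here only as one simple stand-in for the general noise models allowed.
noise = 0.1 * rng.standard_normal((N, N))
noise = (noise + noise.T) / 2.0                 # keep observations symmetric
D_obs = np.abs(D * (1.0 + noise))               # multiplicative corruption
np.fill_diagonal(D_obs, 0.0)
```

The matrix `D_obs` is the only input the recovery algorithms see; `D` exists here solely to measure reconstruction error in a simulation.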
The main challenge addressed is the recovery of all inter-point geodesic distances among a sufficiently dense subsample of M, under only mild statistical and geometric constraints on the noise model and manifold regularity. Unlike prior works (notably [noisyintrinsic]), which rely on strong i.i.d. noise assumptions and access to noise moments, this framework loosens those requirements significantly while maintaining competitive sample complexity and algorithmic efficiency.
Algorithms and Statistical Framework
Intrinsic Distance Recovery
Two main algorithms are presented:
- Algorithm 1: This approach structurally generalizes previous cluster-based methods by leveraging $L^2$-norm statistics derived from the expectation function $f_x(y) = \mathbb{E}\,d'(x,y)$, thereby bypassing explicit requirements on noise moment knowledge or additivity. Key innovations include estimating inner products between the expectation functions from empirical averages and using these as proximity indicators for cluster formation. Sample complexity scales as $N \asymp \varepsilon^{-2d-2}\log(1/\varepsilon)$, and computational runtime is sub-cubic in $N$, i.e. $o(N^3)$. Under mild geometric assumptions (bounded sectional curvature, positive injectivity radius), all pairwise distances in an $\varepsilon$-dense net are recovered up to additive error $O(\log^{-1}\varepsilon)$.
- Algorithm 2: A regularized optimization framework is introduced, employing iterative cluster selection and objective maximization involving inner products and cluster separation penalties. The algorithm adaptively constructs clusters that form an ε-net of M, with sample and runtime complexity controlled by explicit functionals of manifold regularity and noise parameters (see the paper for full parameter dependencies).
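The inner-product machinery shared by both algorithms can be sketched in a few lines: row $j$ of the observed distance matrix serves as an empirical proxy for $f_{x_j}$, Gram-matrix entries approximate the $L^2(\mu)$ inner products $\langle f_{x_j}, f_{x_k}\rangle$, and the induced $L^2$ proximity drives a greedy selection of well-separated cluster centers. This is a minimal sketch of the idea under these assumptions, not the paper's exact estimator or cluster rule:

```python
import numpy as np

def empirical_inner_products(D_obs):
    """Estimate <f_{x_j}, f_{x_k}> in L2(mu) by (1/N) * sum_l d'(x_j, z_l) d'(x_k, z_l),
    using the sample points themselves as the reference points z_l."""
    N = D_obs.shape[0]
    return D_obs @ D_obs.T / N

def l2_proximity(G):
    """||f_{x_j} - f_{x_k}||^2 = <f_j,f_j> + <f_k,f_k> - 2<f_j,f_k>,
    computed from the Gram matrix G of estimated inner products."""
    diag = np.diag(G)
    return diag[:, None] + diag[None, :] - 2.0 * G

def greedy_eps_net(prox, eps2):
    """Greedily keep points whose proximity to all kept centers exceeds eps2,
    so the centers form a well-separated set (a stand-in for cluster formation)."""
    centers = []
    for j in range(prox.shape[0]):
        if all(prox[j, c] > eps2 for c in centers):
            centers.append(j)
    return centers
```

Because the inner products are averages over many reference points, independent per-entry noise is averaged out, which is what lets proximity in $L^2$ stand in for proximity on the manifold without knowing the noise moments.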
Statistical Guarantees
The recovery results rely on the geometric structure of the manifold ($d$-regularity, curvature bounds, and injectivity radius) and statistical properties of the noise (symmetry, independence, sub-Gaussianity, and a bi-Lipschitz mapping from underlying distances to observed expectations). In the case of missing data, the framework provides quantitative lower bounds on the sampling probability that ensure robust recovery, with additive error scaling as $O(r_0^2\,\varepsilon\log^{-1}\varepsilon)$.
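A hypothetical version of the missing-data setting can be simulated as follows: each off-diagonal observation is retained independently with probability $p$, and inner-product estimates are renormalized by the number of jointly observed reference points. Both the Bernoulli mask and this renormalization are illustrative assumptions, not the paper's construction:

```python
import numpy as np

def masked_inner_products(D_obs, p, rng):
    """Estimate L2 inner products when each off-diagonal entry of D_obs is
    observed independently with probability p (hypothetical missing-data model).
    Missing entries are zeroed and each estimate is averaged only over the
    reference points observed for BOTH rows."""
    N = D_obs.shape[0]
    mask = rng.random((N, N)) < p
    mask = np.triu(mask, 1)
    mask = mask | mask.T                        # symmetric observation pattern
    D_masked = np.where(mask, D_obs, 0.0)
    m = mask.astype(float)
    counts = np.maximum(m @ m.T, 1.0)           # jointly observed columns; avoid /0
    return (D_masked @ D_masked.T) / counts
```

Running this for decreasing $p$ in a simulation gives a rough feel for the sampling-probability threshold below which the inner-product estimates, and hence the recovered distances, degrade.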
Strong Results and Technical Contributions
- The reconstruction algorithms do not require the noise to be additive or have fixed moments; instead, only sub-Gaussianity and bi-Lipschitz expectation mapping are needed.
- The recovery guarantees match or surpass those in the literature, such as [noisyintrinsic], in terms of sample complexity (up to scaling and logarithmic factors), while significantly generalizing the applicable noise models.
- Advanced use of empirical L2 inner products and optimization is leveraged to distinguish clusters with high geometric fidelity.
- The methods extend beyond manifold contexts to more general geodesic probability spaces satisfying specified volume and separation conditions.
Furthermore, the theoretical development elucidates which geometric and statistical properties are essential for successful distance recovery in abstract metric spaces, paving the way for generalization beyond Riemannian geometries.
Implications and Applications
The results have direct applicability in numerous inverse problems where internal metrics must be reconstructed from noisy observational data, such as seismic imaging, medical ultrasound, and elastography. Existing approaches often struggle with high noise and incomplete boundary measurements, a challenge these algorithms address directly by relaxing noise restrictions. Since only mild sampling and noise-independence assumptions are needed, these methods make practical denoising and geometric reconstruction from corrupted travel-time or boundary data more feasible.
On a theoretical level, the work advances the understanding of manifold learning under nontraditional noise regimes, suggesting that recovery is possible under much weaker assumptions than previously thought. It also demonstrates that effective empirical statistics, rather than full probabilistic models, may suffice for robust geometric inference. Open questions remain about optimality of sample complexity, extensions to incomplete manifolds, and applications to broader metric probability spaces.
Future Directions
Potential future research avenues identified include:
- Lower bounds on sample complexity and computational efficiency.
- Extending methods to partial or incomplete metric observations.
- Developing adaptive multi-scale cluster refinement algorithms.
- Exploring reconstruction frameworks for metric probability spaces lacking traditional manifold structure.
Conclusion
This work establishes robust statistical and algorithmic frameworks for the reconstruction of intrinsic manifold distances from noisy pairwise observations, featuring minimal assumptions on the noise and broad applicability to geodesic probability spaces with bounded regularity. The presented methods achieve sample and computational complexity on par with prior art, while admitting more general noise models and providing rigorous mathematical guarantees. Implications extend to manifold learning, inverse boundary problems, and metric space reconstruction, opening new directions for both theory and application in high-dimensional data geometry.