Depth Extrapolation: Methods & Applications

Updated 16 April 2026

Depth extrapolation is a set of techniques that generalize predictions beyond observed training depths, applicable in multi-hop reasoning, computer vision, and quantum simulation.
Methodologies such as recurrent-depth transformers, coarse-to-fine pipelines, and operator factorization enable iterative extension while managing error accumulation.
Practical applications include dense depth map completion, elastic wave propagation, and quantum circuit optimization, illustrating the cross-domain significance of these methods.

Depth extrapolation denotes a family of methodologies and theoretical constructs focused on generalizing predictions, model behaviors, or signal propagations beyond the maximal depth, complexity, or support encountered during training or observation. The term spans multiple research domains, including machine reasoning, computer vision, geophysics, and quantum simulation. Common to all contexts is the challenge of robustly extending models or physical computations into regimes of greater compositional, spatial, or temporal depth than those present during initial parameter estimation or data acquisition.

1. Formalizations and Operational Definitions

Depth extrapolation is best defined relative to a domain-specific notion of “depth,” such as reasoning hop-count, physical depth in spatial fields, execution steps in quantum circuits, or composition steps in function application. In implicit multi-hop reasoning tasks over structured data, depth extrapolation refers to the ability of a model trained on multi-step composition up to depth $k_{\text{train}}$ to generalize appropriately to tasks requiring strictly more steps $k > k_{\text{train}}$ (Kohli et al., 9 Apr 2026). In computer vision, depth extrapolation may concern completing or estimating dense depth maps in image regions that lack direct metric priors, thus extending reliable prediction into previously unobserved or sparsely observed subspaces (Wang et al., 15 May 2025, Imran et al., 2021). In computational physics, the term is used for the propagation of signals or fields to depths beyond the reach of explicit two-way numerical solvers, using operator factorization and depth-stepping techniques (Maharramov, 2012). Quantum optimization extends the notion to circuit depth or evolution time, using extrapolation to infer ground-state energies or observable expectations at infinite depth or vanishing error (Cao et al., 2021, Mohammadipour et al., 30 Jul 2025).

2. Depth Extrapolation in Implicit Reasoning and Transformers

In neural implicit reasoning, depth extrapolation is rigorously examined in “Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers” (Kohli et al., 9 Apr 2026). Formally, for a base set of atomic (1-hop) facts $\mathcal{C}$ and the family of $k$-hop composed facts $\mathcal{I}_k(\mathcal{C})$, only depths up to $k_{\text{train}}$ are provided during training. Depth extrapolation is then evaluated by the model’s ability to correctly predict multi-hop facts in $\mathcal{I}_k(\mathcal{C})$ for $k > k_{\text{train}}$.

The recurrent-depth transformer architecture applies a small transformer stack with shared parameters for $R$ iterations:

$h^{(r+1)} = f_{\theta}(h^{(r)}; m)$

where each iteration $r$ increases the effective compositional depth. The training strategy, either fixed recurrence or a dynamically sampled recurrence count $R$, is found to be critical for the ceiling on learnable recursion depth. Empirical results show that inference-time scaling of $R$ (providing more recurrent iterations at test time than at train time) enables significant extrapolation in reasoning depth, up to a new maximal depth beyond $k_{\text{train}}$.
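
The weight-sharing idea behind the recurrence can be illustrated with a deliberately small toy, not the architecture of Kohli et al. Here the shared update $f_{\theta}$ is replaced by a fixed one-hop lookup matrix, the extra input $m$ is ignored, and the names `make_shared_step`, `apply_recurrence`, and `next_entity` are hypothetical. The only point is that one set of parameters, applied more times, reaches deeper compositions.

```python
import numpy as np

def make_shared_step(next_entity):
    """Encode atomic (1-hop) facts as a matrix; a toy stand-in for f_theta."""
    n = len(next_entity)
    step = np.zeros((n, n))
    for src, dst in enumerate(next_entity):
        step[dst, src] = 1.0  # one-hot column: entity src maps to entity dst
    return step

def apply_recurrence(step, h0, num_iters):
    """Apply the same step num_iters times: h^(r+1) = step @ h^(r)."""
    h = h0
    for _ in range(num_iters):
        h = step @ h
    return h

# Assumed toy knowledge base: entity i maps to next_entity[i].
next_entity = [2, 0, 3, 1, 4]
step = make_shared_step(next_entity)
h0 = np.eye(len(next_entity))[0]  # start from entity 0

two_hop = int(np.argmax(apply_recurrence(step, h0, 2)))   # 0 -> 2 -> 3
five_hop = int(np.argmax(apply_recurrence(step, h0, 5)))  # 0 -> 2 -> 3 -> 1 -> 0 -> 2
```

In this toy the step is exact, so any number of iterations works. A learned step is only approximate, which is why the number of useful iterations is bounded in practice.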

Notably, dynamic recurrence scheduling and curriculum learning—where the model is only exposed to deeper compositions once previous shallower ones are mastered—enable the model to discover and parameterize the proper iterative computation. However, a limitation arises in “overthinking”: beyond a certain inference iteration, both the logit margin and prediction accuracy deteriorate, capping practical depth extrapolation. Adaptive halting, combining KL-divergence and entropy thresholds between recurrent steps, mitigates this collapse by terminating computation close to the point of peak confidence (Kohli et al., 9 Apr 2026).
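
A halting rule of this kind might look like the following sketch. The source does not give the exact criterion, so the rule used here (stop at the first iteration whose output distribution is both close to the previous one in KL divergence and low in entropy) is an assumption, as are the function names, the direction of the KL term, and the threshold values.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (nats) of a probability vector."""
    p = np.clip(p, eps, 1.0)
    return float(-np.sum(p * np.log(p)))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def halting_step(step_probs, kl_tol=1e-2, entropy_tol=0.5):
    """Return the first iteration that is stable (small KL to the previous
    step) and confident (low entropy); fall back to the last iteration."""
    for r in range(1, len(step_probs)):
        stable = kl_divergence(step_probs[r], step_probs[r - 1]) < kl_tol
        confident = entropy(step_probs[r]) < entropy_tol
        if stable and confident:
            return r
    return len(step_probs) - 1

# Synthetic per-iteration output distributions; the last one mimics
# "overthinking", where confidence degrades again.
step_probs = [
    np.array([0.40, 0.30, 0.30]),
    np.array([0.70, 0.20, 0.10]),
    np.array([0.95, 0.03, 0.02]),
    np.array([0.96, 0.02, 0.02]),
    np.array([0.50, 0.30, 0.20]),
]
halt_at = halting_step(step_probs)
```

On this synthetic trajectory the rule stops at iteration 3, before the final degraded step is ever computed.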

3. Depth Extrapolation in Depth Completion and Computer Vision

In computer vision, depth extrapolation encompasses the estimation of dense metric depth in image regions lacking direct sensor measurements or prior data. “Depth Anything with Any Prior” (Wang et al., 15 May 2025) establishes a two-stage, coarse-to-fine pipeline integrating both incomplete metric priors (e.g., sparse LiDAR) and relative depth predictions from monocular models. The pipeline first fills missing metric regions through pixel-level metric alignment, using a $k$-nearest-neighbor linear fit of the relative prediction to the sparse metric prior, and then refines the preliminary estimate with a conditioned transformer network that merges RGB, aligned prior, and raw prediction. This strategy achieves zero-shot generalization for depth completion, super-resolution, and inpainting across seven datasets and nine types of prior patterns, demonstrating robust metric depth extrapolation into spatial regions or prior conditions unseen during training.
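
The coarse alignment stage can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: it uses unweighted Euclidean nearest neighbors in pixel space, an ordinary least-squares scale-and-shift fit per missing pixel, and a brute-force neighbor search. The function name `knn_metric_align` and its parameters are hypothetical.

```python
import numpy as np

def knn_metric_align(relative, metric_prior, valid, k=5):
    """Fill pixels without a metric prior by fitting metric = scale * relative
    + shift on the k nearest pixels that do have a prior."""
    ys, xs = np.nonzero(valid)
    known_yx = np.stack([ys, xs], axis=1).astype(float)
    known_rel = relative[valid]
    known_met = metric_prior[valid]
    k = min(k, known_met.size)

    filled = metric_prior.astype(float).copy()
    for y, x in zip(*np.nonzero(~valid)):
        dist_sq = np.sum((known_yx - np.array([y, x])) ** 2, axis=1)
        idx = np.argsort(dist_sq)[:k]  # brute-force nearest neighbors
        design = np.stack([known_rel[idx], np.ones(k)], axis=1)
        (scale, shift), *_ = np.linalg.lstsq(design, known_met[idx], rcond=None)
        filled[y, x] = scale * relative[y, x] + shift
    return filled

# Synthetic check: the true metric depth is an exact affine map of the
# relative prediction, and only every other pixel carries a prior.
rng = np.random.default_rng(0)
relative = rng.random((6, 6))
metric_true = 2.0 * relative + 1.0
valid = np.zeros((6, 6), dtype=bool)
valid[::2, ::2] = True
metric_prior = np.where(valid, metric_true, 0.0)
filled = knn_metric_align(relative, metric_prior, valid, k=5)
```

Fitting locally rather than globally lets the scale and shift vary across the image, which matters when the relative prediction is only locally consistent with the metric prior. In real data the fit is not exact, which is the motivation for a learned refinement stage.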

A complementary approach is the “twin-surface extrapolation” at occlusion boundaries (Imran et al., 2021). Here, rather than interpolate a single depth surface, the model predicts both foreground and background depths, using an asymmetric loss to bias each toward plausible scene hypotheses. The fusion of these surfaces uses image cues to produce accurate single-depth estimates at ambiguous pixels, preserving sharp occlusion boundaries. This twin-extrapolation framework outperforms conventional interpolation approaches, particularly where true depth discontinuities exist.

4. Operator-Theoretic and Physical Depth Extrapolation

In wave physics and geophysics, depth extrapolation refers to the numerical propagation of fields or signals deeper into a spatial medium using operator factorization techniques. Maharramov (Maharramov, 2012) introduces a computationally efficient one-way depth-extrapolation method for isotropic elastic media. The formalism factorizes the full wave operator into a product of pseudo-differential operators, yielding a system where the depth variable $z$ is advanced by exponentiating phase-shift matrices in the spatial frequency domain. This enables explicit downward continuation of multicomponent wavefields, with reported computational savings compared to full time-domain reverse-time migration, while retaining stability and accuracy for moderate dips and frequencies. Lateral heterogeneity is addressed by phase-shift plus interpolation (PSPI), allowing practical application to realistic seismic imaging.
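
The depth-stepping idea is easiest to see in a simplified setting. The sketch below is the scalar (acoustic), constant-velocity phase-shift step, not the multicomponent elastic operator of Maharramov: a monochromatic wavefield is transformed to horizontal wavenumber, multiplied by $\exp(i k_z \Delta z)$ with $k_z = \sqrt{\omega^2/v^2 - k_x^2}$, and transformed back. The sign convention, the function name `phase_shift_step`, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def phase_shift_step(wavefield_x, dx, dz, omega, velocity):
    """Advance a monochromatic scalar wavefield from depth z to z + dz in a
    constant-velocity medium using the phase-shift operator."""
    kx = 2.0 * np.pi * np.fft.fftfreq(wavefield_x.size, d=dx)
    kz_sq = (omega / velocity) ** 2 - kx ** 2
    # Complex square root: kz is imaginary for evanescent components, so the
    # exponential below damps them instead of amplifying them.
    kz = np.sqrt(kz_sq.astype(complex))
    spectrum = np.fft.fft(wavefield_x)
    return np.fft.ifft(spectrum * np.exp(1j * kz * dz))

# A horizontally uniform wave (kx = 0) should only pick up the phase
# omega / velocity * dz per depth step.
n, dx, dz = 64, 10.0, 5.0
omega, velocity = 2.0 * np.pi * 30.0, 2000.0
plane_wave = np.ones(n, dtype=complex)
stepped = phase_shift_step(plane_wave, dx, dz, omega, velocity)
```

Each step costs two FFTs and one pointwise multiplication. Handling lateral velocity variation, as PSPI does, would repeat this step for several reference velocities and interpolate between the results.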

5. Depth Extrapolation in Quantum Simulation and Optimization

Quantum algorithms face constraints on circuit depth and evolution time due to decoherence and hardware limitations. Extrapolation methodologies aim to recover ideal, infinite-depth (or zero-error) results from runs at feasible depths. Cao et al. (Cao et al., 2021) analyze quantum annealing (QA), variational quantum eigensolvers (VQE), and quantum imaginary time evolution (QITE). In QA, the residual energy error after annealing time $T$ decays as an inverse power of $T$, enabling linear least-squares extrapolation against that inverse power to estimate the infinite-time ground-state energy. For VQE and QITE, the energy expectation as a function of the energy variance follows a linear law for sufficiently small residuals, and extrapolating to zero variance recovers the ground-state energy.
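
The variance-based extrapolation reduces to a straight-line fit whose intercept is the estimate. The sketch below uses synthetic, exactly linear data rather than results from any quantum run, and the function name `extrapolate_to_zero` is hypothetical. The same routine applies to the annealing case if the abscissa is taken to be the appropriate inverse power of the annealing time.

```python
import numpy as np

def extrapolate_to_zero(x, energies):
    """Least-squares line through (x, energy) pairs; return the value at x = 0."""
    slope, intercept = np.polyfit(x, energies, deg=1)
    return float(intercept)

# Synthetic data: energy = E0 + 0.5 * variance with E0 = -1.0 (assumed values).
variances = np.array([0.4, 0.3, 0.2, 0.1])
energies = -1.0 + 0.5 * variances
ground_state_estimate = extrapolate_to_zero(variances, energies)
```

With real measurements the linear law holds only for sufficiently small residuals, so points with large variance would typically be excluded from the fit.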

Richardson-style step-size extrapolation is used by Mohammadipour & Li (Mohammadipour et al., 30 Jul 2025) to reduce circuit depth in Lindblad simulation (for open quantum systems). The method combines estimates computed at several larger time steps using coefficients designed to cancel low-order discretization bias, leveraging analytic smoothness in the step size established from backward-error expansions. With only a small number of extrapolation nodes, the circuit depth needed to reach a target error is reduced, with an exponential improvement in its dependence on the target precision, while the sampling complexity is maintained. Node selection via perturbed Chebyshev grids and bias control are key for stability and efficacy (Mohammadipour et al., 30 Jul 2025).
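
The bias-cancelling coefficients can be computed by solving a small linear system. The sketch below is generic Richardson extrapolation under assumptions: the estimator's bias is a polynomial in the step size, the nodes are a simple halving sequence rather than the perturbed Chebyshev grid of the paper, and the toy estimator is synthetic. The name `richardson_weights` is hypothetical.

```python
import numpy as np

def richardson_weights(step_sizes):
    """Weights w with sum_i w_i = 1 and sum_i w_i * h_i**j = 0 for
    j = 1, ..., m - 1, so bias terms up to order h**(m - 1) cancel."""
    h = np.asarray(step_sizes, dtype=float)
    m = h.size
    # Row j of the system holds h_i**j across the nodes i.
    system = np.vander(h, N=m, increasing=True).T
    rhs = np.zeros(m)
    rhs[0] = 1.0
    return np.linalg.solve(system, rhs)

# Synthetic estimator: exact value 2.0 plus first- and second-order bias.
def biased_estimate(h):
    return 2.0 + 0.7 * h - 0.3 * h ** 2

step_sizes = np.array([0.4, 0.2, 0.1])
weights = richardson_weights(step_sizes)
extrapolated = float(weights @ biased_estimate(step_sizes))
```

Because every node uses a comparatively large step, each individual run is shallow. The price is that the weights can be large in magnitude and amplify sampling noise, which is one reason node placement matters for stability.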

6. Limitations, Mechanistic Insights, and Future Directions

The efficacy of depth extrapolation is contingent upon the model’s or algorithm’s ability to meaningfully parameterize the underlying iterative, compositional, or physical process being extrapolated. In recurrent-depth transformers, depth extrapolation only occurs once the network has empirically “grokked” the monotonic composition law, as evident in rapid generalization to deeper hops after mastering a regular curriculum (Kohli et al., 9 Apr 2026). In pixel-based vision systems, performance depends on the local geometric and photometric validity of priors and the ability of architectural fusions to denoise and synthesize high-frequency details (Wang et al., 15 May 2025).

Despite robust extrapolation in many cases, all reviewed methodologies encounter a practical ceiling—typically, a regime where additional recurrence, step size reduction, or iteration triggers overfitting, overthinking, or error accumulation. Adaptive stopping heuristics and regularization are employed to balance extension versus degradation. Mechanistically, extrapolation succeeds when models internally implement the correct algorithmic or physical iterative law; failure modes often reflect a breakdown in such internalization or the compounding of approximation error.

Future research includes extensions to more heterogeneous priors, multi-surface extrapolation for thin or complex occlusions, and domain-general strategies for automating extrapolation control (e.g., robust adaptive halting, meta-learned stopping criteria, or hybrid symbolic–neural architectures).

7. Cross-Domain Table: Depth Extrapolation Paradigms

Domain	Depth Extrapolation Definition	Representative Method / Paper
Multi-hop reasoning	Generalize beyond training hop-count $k_{\text{train}}$ to $k > k_{\text{train}}$	Recurrent-depth transformers (Kohli et al., 9 Apr 2026)
Computer vision (depth maps)	Infer dense metric depth in spatial support absent during training or with new priors	Prior Depth Anything (Wang et al., 15 May 2025)
Physical wave extrapolation	Numerical propagation to greater spatial depth using operator factorization	Elastic wave depth-stepping (Maharramov, 2012)
Quantum optimization/simulation	Predict observables at infinite circuit depth or vanishing error via post-processing	QA/VQE/QITE extrapolation (Cao et al., 2021, Mohammadipour et al., 30 Jul 2025)