
First-Hitting Sampler (FHS)

Updated 22 April 2026
  • First-Hitting Sampler (FHS) is a class of probabilistic algorithms that sample the distribution of a stochastic process at the first time it reaches a specified target set.
  • It leverages explicit path decompositions, combining quasi-stationary mixing phases with geometric tails to ensure unbiased simulation in both discrete and continuous settings.
  • FHS finds practical applications in rare-event analysis and generative modeling, offering rigorous error bounds, finite computational cost, and adaptability to high-dimensional spaces.

A First-Hitting Sampler (FHS) is a class of probabilistic sampling algorithms that generate random samples from the distribution of a stochastic process at the first time it reaches a specified target set or boundary. This framework generalizes the classical idea of strong stationary times and is applied to both discrete and continuous Markov processes (including diffusion models and Markov chains), underpinning unbiased simulation in rare-event analysis, generative modeling, sequential decision problems, and more. FHS fundamentally exploits the path decomposition at the first hitting time, offering rigorous guarantees, sharp error bounds, and practical efficiency in high-dimensional and structured spaces.

1. Foundational Principles of First-Hitting Sampling

FHS leverages the explicit pathwise decomposition of Markov processes at the stopping, or hitting, time of a target set. In a discrete-time Markov chain, for a target set G in state space X, the first hitting time T_G is the minimum t such that X_t ∈ G. In diffusions or continuous-time Markov chains (CTMCs), the first hitting time τ is defined analogously as the infimum over t for which the process enters the absorbing boundary.
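
The definition above can be simulated directly. A minimal sketch for a toy discrete chain (the matrix P and the target set are illustrative, not from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-state chain; state 3 is the target set G, made absorbing.
P = np.array([
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.5, 0.2],
    [0.0, 0.0, 0.0, 1.0],
])

def first_hitting_time(P, start, target, rng):
    """Run the chain from `start`; return (T_G, X_{T_G})."""
    x, t = start, 0
    while x not in target:
        x = rng.choice(len(P), p=P[x])
        t += 1
    return t, x

samples = [first_hitting_time(P, 0, {3}, rng)[0] for _ in range(10_000)]
print(np.mean(samples))  # Monte Carlo estimate of E[T_G] from state 0
```

For this chain the exact mean hitting time from state 0, computed from the fundamental matrix (I − Q)⁻¹ of the restriction Q to the transient states, is about 8.17, which the empirical mean approaches.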

The FHS approach arises from the study of strong stationary times and their generalizations. The latter, especially Conditionally Strong Quasi-Stationary Times (CSQSTs), allow one to decompose the first-hitting law into a "mixing phase" toward a quasi-stationary regime and an independent geometric (or exponential) tail, which can be sampled exactly under explicit, verifiable conditions (e.g., ergodicity, primitivity) (Manzo et al., 2016). In continuous spaces, parametrix expansions or Doob's h-transform are employed to represent exit-time densities and bridge laws (Frikha et al., 2016, Ye et al., 2022).

2. FHS in Discrete Markov Chains and Quasi-Stationarity

In a discrete Markov chain (X_t) with transition matrix P, target/absorbing set G, and complement A = X \ G, FHS exploits the quasi-stationary structure of the substochastic restriction Q of P to A. After initial mixing to the quasi-stationary law μ* on A (CSQST τ*), the residual time until hitting G is exactly geometric with parameter 1 − λ (where λ is the spectral radius of Q). The joint law of (T_G, X_{T_G}) then decomposes as

T_G = τ* + S,

with S ~ Geometric(1 − λ) independent of the mixing phase, up to a computable shift. This separation underpins an exact sampling algorithm: first mix to quasi-stationarity via the minimal CSQST, then simulate a geometric tail to the first hit, sampling the exit location from ν*, the quasi-stationary exit distribution (Manzo et al., 2016). All components are computable from the underlying chain's eigenstructure, with pathwise and probabilistic correctness guarantees.
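
A minimal numerical sketch of the quasi-stationary ingredients, assuming the substochastic matrix Q below stands in for the chain restricted to the complement of G (it illustrates only the geometric-tail half of the decomposition, not the full CSQST construction):

```python
import numpy as np

rng = np.random.default_rng(1)

# Substochastic restriction of a toy chain to the complement of G.
Q = np.array([
    [0.5, 0.3, 0.1],
    [0.2, 0.5, 0.2],
    [0.1, 0.2, 0.5],
])

# Quasi-stationary law mu*: normalized left Perron eigenvector of Q,
# whose eigenvalue lam is the spectral radius of Q.
w, V = np.linalg.eig(Q.T)
k = np.argmax(w.real)
lam = w[k].real
mu_star = V[:, k].real
mu_star /= mu_star.sum()

# Started from mu*, the residual hitting time of G is exactly
# Geometric(1 - lam): P(T = t) = lam**(t - 1) * (1 - lam).
tail = rng.geometric(1.0 - lam, size=100_000)
print(lam, tail.mean())  # sample mean approaches 1 / (1 - lam)
```

Everything here is computed from the eigenstructure of Q alone, mirroring the claim that all FHS components are available from the chain's eigendata.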

3. FHS for Continuous Diffusions and Stochastic Differential Equations

For stochastic differential equations (SDEs) and continuous diffusions, the FHS method is grounded in the parametrix expansion of hitting-time densities and Doob's h-transform for path conditioning. Specifically, for a diffusion stopped at an absorbing boundary, the joint law of the hitting time and hitting location (τ, X_τ) is represented as a convergent series of density kernels (frozen bridge densities, Lévy densities, and parametrix corrections), each admitting Gaussian-type bounds and analytic expressions involving Hermite polynomials (Frikha et al., 2016). The Monte Carlo FHS simulates Poisson-distributed random inter-arrival times and propagates "frozen" Euler bridges between these times, terminating upon hitting the boundary and correcting for bias through analytic weights, yielding an unbiased estimator for expectations of functionals of (τ, X_τ). This method is robust to low regularity of the coefficients, supports heavy-tailed or truncated parametrix series, and achieves finite expected computational cost proportional to the mean hitting time.
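
The full parametrix scheme is involved; as a minimal illustration of exact first-hitting sampling in the continuous setting, consider the special case of Brownian motion with positive drift hitting a level, where the hitting time is inverse Gaussian and can be drawn in closed form via the Michael-Schucany-Haas transformation (the function name and parameters are illustrative, not from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_hitting_time_bm(a, m, sigma, rng):
    """First hitting time of level a > 0 by dX = m dt + sigma dW, X_0 = 0, m > 0.

    tau ~ InverseGaussian(mean mu = a/m, shape lam = a**2 / sigma**2),
    sampled with the Michael-Schucany-Haas method.
    """
    mu, lam = a / m, a**2 / sigma**2
    y = rng.standard_normal() ** 2
    x = mu + mu**2 * y / (2 * lam) \
        - mu / (2 * lam) * np.sqrt(4 * mu * lam * y + mu**2 * y**2)
    if rng.random() <= mu / (mu + x):
        return x
    return mu**2 / x

taus = np.array([sample_hitting_time_bm(1.0, 0.5, 1.0, rng)
                 for _ in range(50_000)])
print(taus.mean())  # approaches a/m = 2.0
```

Here the exit location is degenerate (X_τ equals the barrier), so only τ is random; the parametrix machinery of Frikha et al. (2016) is what extends unbiased sampling of (τ, X_τ) to general drift/diffusion coefficients where no closed form exists.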

4. FHS in Diffusion-Based Generative Modeling

The FHS paradigm has recently found impactful use in discrete latent diffusion models for generative modeling of symbolic data, such as text sequences and categorical images. In the masked diffusion setting, the process is described by a CTMC on the token space augmented with a mask state, with mask-absorbing rates for each coordinate. FHS here proceeds by iteratively unmasking coordinates in continuous time, exactly matching the jump-time and jump-index laws of the reverse CTMC. The sampling path is described by a sequence of jump times and jump indices, with randomness realized by uniform variables and coordinate selections (Liang et al., 26 Feb 2026).
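
A toy sketch of this event-driven unmasking loop, assuming a linear masking schedule (so each coordinate's reverse-time unmask event is uniform on [0, 1]) and a placeholder uniform `dummy_score` standing in for the learned network:

```python
import numpy as np

rng = np.random.default_rng(3)

MASK, V, d = -1, 5, 8  # mask token, vocabulary size, sequence length

def dummy_score(x_partial, i, rng):
    """Stand-in for the learned network predicting token i given the
    partially unmasked sequence; here just uniform over the vocabulary."""
    return rng.integers(V)

# Under a linear schedule each coordinate's unmask event is uniform on
# [0, 1]; FHS jumps directly between events, with no time grid.
x = np.full(d, MASK)
times = rng.uniform(size=d)
for t, i in sorted(zip(times, range(d)), reverse=True):  # reverse time: 1 -> 0
    x[i] = dummy_score(x, i, rng)

print(x)  # fully unmasked sequence, one network call per coordinate
```

The loop makes exactly d score evaluations, one per jump event, which is the source of the "no discretization error" property: there is no time grid to refine.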

Crucially, error analysis reveals that FHS's sampling discrepancy is solely attributable to score estimation (e.g., the error in predicting conditional token probabilities by the learned network), with no discretization or surrogate-initialization error. This is in contrast to standard τ-leaping Euler schemes, where discretization error persists even with perfect scores. The pathwise KL bound for FHS is

KL(data law ∥ FHS output law) ≤ ε_score, the cumulative score-estimation error,

which is tight in an information-theoretic sense (Liang et al., 26 Feb 2026). This dimension-free, vocabulary-free guarantee is significant for high-dimensional symbolic domains.

5. Algorithmic Formulations and Implementation

Several algorithmic variants of FHS exist, corresponding to discrete chains, diffusions, and masked CTMCs. Key components include:

  • Precomputation of transition eigendata (spectral radius λ, quasi-stationary law μ*) and the local separation curve (discrete setting).
  • Sampling of CSQST and geometric components for the total hitting time (discrete Markov chains).
  • Poisson grid simulation, frozen bridges, and unbiased reweighting for one-dimensional diffusions (Frikha et al., 2016).
  • Direct jump-time computations, coordinate selection, and neural score predictions for masked diffusion models (Liang et al., 26 Feb 2026).
  • Conditioning SDEs on exit locations via Doob's h-transform and absorbing-surface boundary laws for FHS in generative modeling (Ye et al., 2022).

A table summarizing key FHS algorithmic elements in various settings:

Setting | Path Decomposition | Practical Steps
Discrete Markov chains | CSQST + geometric tail | Eigenproblem, separation, sampling CSQST/tail (Manzo et al., 2016)
1D Diffusions | Parametrix series, Poisson grid | Simulate Poisson events, Euler bridges, weight corrections (Frikha et al., 2016)
Masked Diffusion CTMCs | Exact unmasking sequence | Sample jump-times/indices, neural score prediction per event (Liang et al., 26 Feb 2026)

6. Convergence, Error Bounds, and Complexity

FHS often delivers strong, explicit guarantees. In masked diffusion models, the dimensionality and vocabulary size do not affect error bounds or convergence rates; the only source of statistical error is the score-estimation accuracy ε_score, as opposed to schemes like Euler/τ-leaping that incur additional initialization and discretization errors. Moreover, the information-theoretic lower bound for FHS matches the upper bound, confirming the tightness of the analysis (Liang et al., 26 Feb 2026).

For elliptic diffusions, Gaussian-type bounds and complete control over the variance of Monte Carlo weights ensure practical unbiasedness and finite computational overhead for moderate parametrix truncation or Poisson intensity parameter choice (Frikha et al., 2016). In discrete metastable regimes, the CSQST is typically much shorter than the mean hitting time, making FHS computationally near-optimal (Manzo et al., 2016).

7. Applications and Empirical Outcomes

FHS has demonstrated substantial benefits in a variety of domains:

  • In generative modeling on point clouds, graphs, and categorical images, FHS (and the associated First Hitting Diffusion Models, FHDM) achieves higher sample quality and substantially reduces the number of diffusion steps required—often by an order of magnitude—compared to fixed-time, non-adaptive schedulers (Ye et al., 2022).
  • In rare-event simulation and stochastic process analysis, FHS enables exact estimation of hitting-time distributions even in complex, non-reversible Markov chains or SDEs with general drift/diffusion structure (Manzo et al., 2016, Frikha et al., 2016).
  • In masked language modeling and large-scale symbolic data, FHS's guarantee of zero discretization error and tight convergence is particularly advantageous for scalability and sharp performance control (Liang et al., 26 Feb 2026).

A plausible implication is that future methodologies for generative modeling and rare-event simulation across a broad array of discrete and continuous domains will increasingly standardize on algorithms grounded in first-hitting path decompositions, supplanting classical fixed-time diffusion or decoupled sampling paradigms.

