SHARPy: SMC–NUTS for GW Inference

Updated 4 July 2026

SHARPy is a Bayesian framework for gravitational-wave inference that combines Sequential Monte Carlo with the No-U-Turn Sampler.
It leverages local posterior geometry through adaptive mass matrix estimation and customized boundary handling to navigate complex, high-dimensional spaces.
Built on JAX for GPU acceleration, SHARPy achieves rapid evidence estimation and sample recycling, significantly reducing runtime versus traditional methods.

SHARPy most directly denotes the Sequential Hamiltonian Riemann Monte Carlo Python sampler, a Bayesian inference framework for gravitational-wave parameter estimation and model comparison that combines Sequential Monte Carlo (SMC) with the No-U-Turn Sampler (NUTS), exploits local posterior geometry, and is built on JAX for GPU acceleration (Demasi et al., 5 Jan 2026). In adjacent literatures, closely related spellings such as ShaRPy and SHARP denote unrelated systems, including RGB-D hand tracking with uncertainty and several other domain-specific methods and instruments (Wirth et al., 2023). This suggests that the term is best interpreted contextually, with the exact capitalization in recent gravitational-wave inference referring to the 2026 SMC–NUTS framework.

1. Nomenclature and scope

The exact label SHARPy is introduced in gravitational-wave inference as the Sequential Hamiltonian Riemann Monte Carlo Python sampler (Demasi et al., 5 Jan 2026). A visually similar spelling, ShaRPy, stands for Shape Reconstruction and Hand Pose Estimation from RGB-D with Uncertainty in markerless clinical hand tracking (Wirth et al., 2023). The broader SHARP acronym also appears in LLM compression, continual learning, video token pruning, robotics, astronomical instrumentation, and solar data products (Wang et al., 11 Feb 2025, Gurbuz et al., 2023, Xia et al., 5 Dec 2025, Lachmansingh et al., 23 Sep 2025, Mahmoodzadeh et al., 8 Sep 2025, Bobra et al., 2014).

Name	Expansion	Domain
SHARPy	Sequential Hamiltonian Riemann Monte Carlo Python sampler	Gravitational-wave inference (Demasi et al., 5 Jan 2026)
ShaRPy	Shape Reconstruction and Hand Pose Estimation from RGB-D with Uncertainty	RGB-D hand pose and shape estimation (Wirth et al., 2023)
SHARP	SHaring Adjacent layers with Recovery Parameters	LLM inference acceleration (Wang et al., 11 Feb 2025)
SHARP	Sparsity and Hidden Activation RePlay	Continual learning (Gurbuz et al., 2023)
ShaRP	SHAllow-LayeR Pruning	Video LLM acceleration (Xia et al., 5 Dec 2025)
SHARP	Supercomputing for High-speed Avoidance and Reactive Planning	Robotics and HPC offloading (Lachmansingh et al., 23 Sep 2025)

Because the exact topic name is SHARPy, the gravitational-wave framework is the primary referent. The similarity of the spellings nevertheless creates a recurring source of confusion. A common misconception is that these names identify versions of a single software family; the cited papers instead describe independent systems with unrelated objectives, data modalities, and algorithmic foundations.

2. Bayesian problem formulation in gravitational-wave inference

SHARPy is motivated by the computational burden of gravitational-wave (GW) parameter estimation and model comparison. The target posterior is written as

$p(\boldsymbol{\theta}|d, H) = \frac{\mathcal{L}(d|\boldsymbol{\theta}, H)\,\pi(\boldsymbol{\theta}| H)}{p(d| H)},$

with Bayesian evidence

$\mathcal{Z} = p(d|H) = \int d\boldsymbol{\theta}\,\mathcal{L}(d|\boldsymbol{\theta},H)\,\pi(\boldsymbol{\theta}|H).$

For GW data under the assumption of stationary Gaussian noise, the log-likelihood is

$\log\mathcal{L}(d|\boldsymbol{\theta}) = -\frac{1}{2}\left\langle d-h(\boldsymbol{\theta}) \middle| d-h(\boldsymbol{\theta})\right\rangle,$

where

$\left\langle a|b\right\rangle = 4\mathrm{Re}\int_0^\infty \frac{a^*(f)b(f)}{S_n(f)}\,df.$

In this formulation, the evidence is not merely a normalization constant. It is described as central for model comparison, because it quantifies how well a model explains the data after integrating over all parameter values (Demasi et al., 5 Jan 2026).
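The noise-weighted inner product and the Gaussian log-likelihood above can be illustrated with a discretized sketch. This is plain Python rather than the JAX code used by SHARPy; the function names, the uniform frequency spacing `df`, and the toy three-bin series are assumptions made for illustration only.

```python
def inner_product(a, b, psd, df):
    """Discretized <a|b> = 4 Re sum_k conj(a_k) * b_k / S_n(f_k) * df."""
    total = sum(ak.conjugate() * bk / sk for ak, bk, sk in zip(a, b, psd))
    return 4.0 * total.real * df


def log_likelihood(data, template, psd, df):
    """Gaussian-noise log-likelihood, up to a theta-independent constant."""
    residual = [dk - hk for dk, hk in zip(data, template)]
    return -0.5 * inner_product(residual, residual, psd, df)


# Toy positive-frequency series (hypothetical values, three bins).
df = 0.5
psd = [2.0, 2.0, 4.0]
data = [1 + 1j, 2 + 0j, 0 - 1j]
template = [1 + 0j, 2 + 0j, 0 + 0j]

logl = log_likelihood(data, template, psd, df)
```

In a real analysis the template would come from a waveform model evaluated at $\boldsymbol{\theta}$, and the sum would run over the analysed frequency band of each detector.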

The paper situates SHARPy against the practical limitations of standard GW inference workflows. The posterior is described as high-dimensional, multimodal, and constrained by sharp boundaries, while waveform evaluations are costly. Nested Sampling is characterized as robust and widely used but often requiring hours or days. SHARPy is developed to preserve the statistical rigor of likelihood-based GW inference while reducing wall-clock time from hours to minutes (Demasi et al., 5 Jan 2026). A plausible implication is that the framework is meant not only as a faster sampler, but also as infrastructure for evidence-sensitive analyses in which posterior estimation alone is insufficient.

3. SMC–NUTS algorithmic structure

The core construction of SHARPy is an SMC population that evolves from prior to posterior through a temperature ladder. At iteration $t$, the intermediate distribution is

$p_t(\boldsymbol{\theta}|d) = \frac{\mathcal{L}(d|\boldsymbol{\theta})^{\beta_t}\pi(\boldsymbol{\theta})}{\mathcal{Z}_t},$

with $\beta_0=0$ at the prior and $\beta_T=1$ at the posterior (Demasi et al., 5 Jan 2026).

Each SMC iteration comprises three standard stages. In reweighting, particles receive weights based on the change in inverse temperature, and the framework monitors the effective sample size

$\mathrm{ESS}_t = \frac{\left(\sum_{i=1}^N w_t^{(i)}\right)^2}{\sum_{i=1}^N \left(w_t^{(i)}\right)^2}.$

In adaptive temperature selection, $\beta_t$ is chosen so that the ESS remains at a controlled fraction $\alpha$ of the particle count $N$ through

$\mathrm{ESS}_t(\beta_t) = \alpha N.$

In resampling and mutation, low-weight particles are replaced and then moved with a Markov transition kernel (Demasi et al., 5 Jan 2026).
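The reweighting and adaptive temperature stages can be sketched as a bisection search on the inverse temperature. The sketch assumes equally weighted particles after the previous resampling step, incremental weights equal to the likelihood raised to the change in inverse temperature, and a target ESS fraction `alpha`; the function names and the bisection strategy are illustrative choices, not necessarily those of SHARPy.

```python
import math


def ess(log_w):
    """Effective sample size (sum w)^2 / sum w^2, computed from log-weights."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]  # rescaling leaves the ESS unchanged
    return sum(w) ** 2 / sum(x * x for x in w)


def next_beta(log_like, beta_prev, alpha=0.5, tol=1e-10):
    """Bisect for beta in (beta_prev, 1] such that ESS = alpha * N."""
    target = alpha * len(log_like)

    def ess_at(beta):
        return ess([(beta - beta_prev) * ll for ll in log_like])

    if ess_at(1.0) >= target:  # the posterior can be reached in one step
        return 1.0
    lo, hi = beta_prev, 1.0    # invariant: ESS(lo) > target >= ESS(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ess_at(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy particle log-likelihoods (hypothetical values).
log_like = [-50.0, -20.0, -5.0, -1.0, -0.5, -0.1, -30.0, -8.0]
beta_1 = next_beta(log_like, beta_prev=0.0, alpha=0.5)
ess_1 = ess([beta_1 * ll for ll in log_like])
```

A smaller `alpha` takes larger temperature steps and needs fewer iterations, at the cost of more weight degeneracy for the mutation step to repair.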

The distinctive design choice is the mutation kernel. Rather than using a generic MCMC step, SHARPy uses NUTS, described as an adaptive version of Hamiltonian Monte Carlo. HMC augments the parameters $\boldsymbol{\theta}$ with auxiliary momentum variables $\boldsymbol{p}$, draws momenta from a Gaussian defined by a mass matrix $M$, and evolves the system under the Hamiltonian

$\mathcal{H}(\boldsymbol{\theta},\boldsymbol{p}) = -\log p_t(\boldsymbol{\theta}|d) + \frac{1}{2}\boldsymbol{p}^{\mathsf{T}} M^{-1}\boldsymbol{p}.$

NUTS removes the need to hand-tune trajectory length by terminating when the trajectory starts to “turn back” on itself. In the GW setting, this is important because posterior geometry is described as complex and traditional random-walk proposals are inefficient (Demasi et al., 5 Jan 2026).
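The Hamiltonian dynamics underlying HMC and NUTS are usually integrated with the leapfrog scheme. The following sketch shows that integrator only; it does not implement the NUTS tree-building or U-turn criterion. The diagonal mass matrix, the quadratic toy potential, and all names are assumptions chosen to keep the example dependency-free.

```python
def leapfrog(theta, p, grad_u, inv_mass, step, n_steps):
    """Integrate Hamilton's equations for H = U(theta) + 0.5 * sum(p_i^2 / m_i).

    `inv_mass` holds the diagonal of M^{-1}; M stays fixed along the trajectory.
    """
    theta, p = list(theta), list(p)
    g = grad_u(theta)
    for _ in range(n_steps):
        # Half step in momentum, full step in position, half step in momentum.
        p = [pi - 0.5 * step * gi for pi, gi in zip(p, g)]
        theta = [ti + step * mi * pi for ti, mi, pi in zip(theta, inv_mass, p)]
        g = grad_u(theta)
        p = [pi - 0.5 * step * gi for pi, gi in zip(p, g)]
    return theta, p


def hamiltonian(theta, p, u, inv_mass):
    """Total energy: potential plus kinetic term with diagonal mass matrix."""
    return u(theta) + 0.5 * sum(mi * pi * pi for mi, pi in zip(inv_mass, p))


# Toy target: anisotropic Gaussian, U(theta) = 0.5 * sum(k_i * theta_i^2).
k = [1.0, 100.0]


def u(th):
    return 0.5 * sum(ki * ti * ti for ki, ti in zip(k, th))


def grad_u(th):
    return [ki * ti for ki, ti in zip(k, th)]


inv_mass = [1.0 / ki for ki in k]  # mass matrix matched to the curvature of U
theta0, p0 = [1.0, 0.1], [0.5, -2.0]
theta1, p1 = leapfrog(theta0, p0, grad_u, inv_mass, step=0.1, n_steps=20)
energy_error = abs(hamiltonian(theta1, p1, u, inv_mass)
                   - hamiltonian(theta0, p0, u, inv_mass))
```

Because the mass matrix matches the curvature in this toy case, both coordinates oscillate on the same time scale and a single step size serves the stiff and the soft direction alike, which is the motivation for geometry-informed mass matrices.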

Methodologically, SHARPy therefore couples two distinct strengths: SMC contributes particle parallelism and evidence estimation, while NUTS contributes efficient exploration of high-dimensional posteriors. This suggests that the framework is designed to avoid the usual trade-off between evidence-aware population methods and gradient-based local exploration.

4. Local geometry, boundary treatment, and software stack

A major feature of SHARPy is its use of local posterior geometry. The paper emphasizes that fixed-metric HMC or NUTS methods can miss local structure when correlations and curvatures vary strongly across parameter space. SHARPy therefore adopts a hybrid geometric strategy: at the beginning of each SMC iteration, and for each particle, the mass matrix $M$ is set to the Hessian of the posterior,

$M = -\nabla_{\boldsymbol{\theta}}\nabla_{\boldsymbol{\theta}}^{\mathsf{T}} \log p_t(\boldsymbol{\theta}|d),$

so that local curvature informs the momentum distribution; during the actual NUTS mutation step, $M$ is kept fixed, preserving separability of the Hamiltonian equations and allowing standard leapfrog integration (Demasi et al., 5 Jan 2026).

The paper characterizes this as borrowing from Riemannian ideas without incurring the full complexity of a fully position-dependent Hamiltonian integrator. In practical terms, the aim is to navigate narrow ridges, curved degeneracies, and anisotropic structure more effectively. This is especially relevant for GW inference, where posterior structure is often shaped by strong parameter correlations and physically imposed bounds (Demasi et al., 5 Jan 2026).
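The idea of a curvature-based mass matrix can be illustrated by evaluating the negative Hessian of a log-density at a particle's position. SHARPy obtains derivatives through JAX automatic differentiation; in this dependency-free sketch, central finite differences stand in for autodiff, and the correlated Gaussian target and all names are hypothetical.

```python
def neg_hessian(log_post, theta, h=1e-3):
    """Central finite-difference estimate of -d^2 log_post / dtheta_i dtheta_j."""
    n = len(theta)

    def shifted(i, di, j, dj):
        x = list(theta)
        x[i] += di
        x[j] += dj
        return log_post(x)

    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = (shifted(i, h, j, h) - shifted(i, h, j, -h)
                  - shifted(i, -h, j, h) + shifted(i, -h, j, -h)) / (4.0 * h * h)
            hess[i][j] = -d2
    return hess


# Toy log-posterior: correlated Gaussian with precision [[2.0, 0.5], [0.5, 1.0]].
def log_post(th):
    x, y = th
    return -0.5 * (2.0 * x * x + 2.0 * 0.5 * x * y + 1.0 * y * y)


mass = neg_hessian(log_post, [0.3, -0.2])
```

For a Gaussian target the result is the precision matrix at every point; for a realistic posterior it varies from particle to particle, which is why the matrix is recomputed per particle but held fixed within a trajectory.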

Boundary handling is treated explicitly. SHARPy enforces reflective boundaries for bounded variables and periodic boundaries for angular variables. The stated motivation is that parameters such as mass ratio are bounded and angles are periodic, so the sampler must respect physical constraints while continuing to move efficiently through parameter space (Demasi et al., 5 Jan 2026).
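The two boundary rules can be sketched as coordinate maps applied after a position update. The function names are hypothetical, and the momentum sign flip on reflection is the usual convention for reflective walls rather than a detail confirmed by the source.

```python
import math


def reflect(x, p, lo, hi):
    """Fold x back into [lo, hi] by mirror reflection; flip p on an odd bounce."""
    width = hi - lo
    y = (x - lo) % (2.0 * width)  # position in a doubled, mirrored domain
    if y <= width:
        return lo + y, p
    return lo + 2.0 * width - y, -p


def wrap(x, lo, hi):
    """Map x into [lo, hi) by periodic wrapping; the momentum is unchanged."""
    return lo + (x - lo) % (hi - lo)


# A bounded parameter on [0, 1] that overshoots the upper wall.
q_new, p_new = reflect(1.2, 1.0, 0.0, 1.0)

# An angle on [0, 2*pi) that runs past one full turn.
phase_new = wrap(2.0 * math.pi + 0.5, 0.0, 2.0 * math.pi)
```

Reflection keeps bounded parameters such as the mass ratio inside their physical range without rejecting the whole trajectory, while wrapping lets angular parameters cross the branch cut freely.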

The implementation stack is also part of the method’s design. SHARPy is built entirely in JAX, uses BLACKJAX for NUTS, and obtains waveforms through ripple. JAX is used for automatic differentiation and device-agnostic compilation and vectorization, allowing the framework to run efficiently on GPUs and exploit the parallelism inherent in SMC (Demasi et al., 5 Jan 2026). A plausible implication is that the method’s performance is inseparable from this software architecture: the combination of autodiff, JIT compilation, GPU execution, and particle parallelism is presented as the route by which a traditionally sequential workload becomes highly parallel.

5. Evidence estimation, sample recycling, and reported performance

Because SMC reweights particles across intermediate temperatures, SHARPy obtains an evidence estimate through the ratio of normalizing constants,

$\frac{\mathcal{Z}_t}{\mathcal{Z}_{t-1}} \approx \frac{1}{N}\sum_{i=1}^{N} w_t^{(i)}, \qquad w_t^{(i)} = \mathcal{L}\left(d|\boldsymbol{\theta}_{t-1}^{(i)}\right)^{\beta_t-\beta_{t-1}}.$

Since $\mathcal{Z}_0 = 1$ for a normalized prior, the full evidence is the product of these ratios across all SMC steps (Demasi et al., 5 Jan 2026). This is central to the framework’s model-comparison role.
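The accumulation of the evidence over the temperature ladder can be checked on a toy problem with a known answer. The sketch assumes a standard normal prior and an unnormalized likelihood $\exp(-\theta^2/2)$, for which the evidence is $1/\sqrt{2}$; exact draws from each tempered distribution stand in for the resample-and-mutate stage, and the fixed ladder and all names are illustrative.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def log_evidence_increment(log_like, beta_prev, beta_new):
    """Estimate log(Z_t / Z_{t-1}) as the log of the mean incremental weight."""
    return log_mean_exp([(beta_new - beta_prev) * ll for ll in log_like])


rng = random.Random(0)
betas = [0.0, 0.25, 0.5, 0.75, 1.0]  # fixed ladder for illustration
n_particles = 20000

log_z = 0.0
for b_prev, b_new in zip(betas[:-1], betas[1:]):
    # The tempered target at b_prev is Gaussian with variance 1 / (1 + b_prev).
    sigma = (1.0 + b_prev) ** -0.5
    particles = [rng.gauss(0.0, sigma) for _ in range(n_particles)]
    log_like = [-0.5 * th * th for th in particles]
    log_z += log_evidence_increment(log_like, b_prev, b_new)

exact_log_z = -0.5 * math.log(2.0)
```

Summing the log-increments is the numerically safe form of multiplying the ratios, and the estimate should agree with `exact_log_z` to within Monte Carlo error.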

The paper further states that SHARPy recycles samples from all intermediate temperatures rather than discarding everything except the final iteration. It constructs the pooled distribution

$\tilde{p}(\boldsymbol{\theta}|d) = \frac{1}{T}\sum_{t=1}^{T} p_t(\boldsymbol{\theta}|d),$

and then uses rejection sampling to extract i.i.d. posterior samples. The intended effect is to reduce waste and improve effective sample output (Demasi et al., 5 Jan 2026).
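The rejection step can be sketched generically: each pooled sample carries an importance weight proportional to the ratio of the posterior density to the pooled density, and is kept with probability equal to its weight divided by the largest weight. In this toy example a wide Gaussian plays the role of the pool and a standard normal plays the role of the posterior; both choices and the function name are assumptions.

```python
import math
import random


def rejection_resample(samples, log_w, rng):
    """Keep each sample with probability w / max(w); survivors are unweighted."""
    m = max(log_w)
    return [s for s, lw in zip(samples, log_w) if rng.random() < math.exp(lw - m)]


rng = random.Random(0)

# Pool drawn from N(0, 2^2); target is N(0, 1).
pool = [rng.gauss(0.0, 2.0) for _ in range(20000)]

# log target - log pool density = -x^2/2 + x^2/8, up to an additive constant.
log_w = [-0.375 * x * x for x in pool]

kept = rejection_resample(pool, log_w, rng)
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
```

Unlike weighted resampling with replacement, rejection returns distinct, unweighted draws, at the price of a random output size set by the acceptance rate.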

Empirical results are reported for both simulated and real GW data. For 100 simulated binary black-hole signals injected into Gaussian noise, SHARPy passed a probability–probability (P–P) test, indicating statistical unbiasedness. On a single NVIDIA A100 GPU, it produced on average around 27,000 posterior samples in slightly more than 15 minutes (Demasi et al., 5 Jan 2026).
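The logic of a P–P test can be sketched on a one-dimensional conjugate Gaussian model, where exact posterior draws are available. For each simulated event, the credible level at which the true value sits is recorded; for an unbiased sampler these levels are uniformly distributed, so their empirical CDF follows the diagonal. The model, the sample counts, and the function names are assumptions for illustration and do not reproduce the GW analysis.

```python
import random


def credible_level(posterior_samples, truth):
    """Fraction of posterior samples lying below the injected value."""
    return sum(s < truth for s in posterior_samples) / len(posterior_samples)


def pp_max_deviation(levels):
    """Largest gap between the empirical CDF of the levels and the diagonal."""
    xs = sorted(levels)
    n = len(xs)
    return max(max(abs((i + 1) / n - x), abs(i / n - x)) for i, x in enumerate(xs))


rng = random.Random(1)
noise_var = 0.25
levels = []
for _ in range(100):
    truth = rng.gauss(0.0, 1.0)                      # draw from the N(0, 1) prior
    datum = truth + rng.gauss(0.0, noise_var ** 0.5)  # noisy observation
    post_mean = datum / (1.0 + noise_var)             # conjugate Gaussian posterior
    post_std = (noise_var / (1.0 + noise_var)) ** 0.5
    samples = [rng.gauss(post_mean, post_std) for _ in range(500)]
    levels.append(credible_level(samples, truth))

deviation = pp_max_deviation(levels)
```

A sampler that systematically under- or over-covers would push the empirical CDF away from the diagonal and produce a large `deviation`.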

For GW150914, using an 11-dimensional aligned-spin waveform model, SHARPy was run 100 times and produced around 30,000 samples in about 10 minutes, requiring about 55 SMC iterations to evolve from $\beta=0$ to $\beta=1$ (Demasi et al., 5 Jan 2026). Posterior samples were compared with Dynesty via Bilby, and the corner plots showed close agreement in intrinsic and extrinsic parameters. The paper reports that the Jensen–Shannon divergence between marginal posteriors was for most parameters below or near the threshold for consistency, while declination, luminosity distance, and inclination showed somewhat larger divergence, attributed mainly to sharp posterior structure and boundary effects that make density estimation harder (Demasi et al., 5 Jan 2026).
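The Jensen–Shannon divergence between two one-dimensional marginals can be sketched with histogram density estimates. The binning scheme and the function names are assumptions; the source does not state which density estimator or threshold value was used.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between two discrete distributions."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0.0)

    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def histogram(samples, lo, hi, bins):
    """Normalized histogram of the samples that fall inside [lo, hi)."""
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            k = min(int((s - lo) / (hi - lo) * bins), bins - 1)
            counts[k] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Two small sample sets for one parameter (hypothetical values).
p_hist = histogram([0.1, 0.2, 0.6, 0.9], 0.0, 1.0, 2)
q_hist = histogram([0.3, 0.7, 0.8, 0.95], 0.0, 1.0, 2)
jsd = js_divergence(p_hist, q_hist)
```

The divergence is zero for identical histograms and reaches $\ln 2$ for histograms with disjoint support. Sharp edges in a marginal make the result sensitive to the binning, which is consistent with the larger values reported for boundary-affected parameters.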

Evidence estimation is also benchmarked against Nested Sampling. Across 100 independent runs, SHARPy’s evidence distribution was consistent with Dynesty at the 90% level. The Dynesty value tended to lie in the upper tail, which the authors interpret as suggesting that SHARPy may slightly underestimate evidence relative to Nested Sampling, a behavior also noted in related SMC work (Demasi et al., 5 Jan 2026). The broader claim of the paper is therefore not identity with Nested Sampling output, but consistency in posterior and evidence while reducing runtime from hours to minutes.

6. Other systems named ShaRPy or SHARP

Outside gravitational-wave inference, the closest spelling is ShaRPy, a markerless hand tracking system designed for the diagnosis or monitoring of activity in inflammatory musculoskeletal diseases. That system combines a data-driven dense correspondence predictor with traditional energy minimization, estimates both hand pose and personalized hand shape from a single consumer-level RGB-D camera, incorporates biomedical constraints into a parametric hand model, and provides segment-level uncertainty estimates when fingers are hidden or inconsistent with the observations (Wirth et al., 2023). The overlap with gravitational-wave SHARPy is nominal rather than methodological.

The acronym SHARP is also used for several unrelated constructs. In LLM deployment, it denotes SHaring Adjacent layers with Recovery Parameters, a post-training method that shares adjacent MLP layers and adds low-rank recovery parameters to reduce memory movement and inference time on mobile devices (Wang et al., 11 Feb 2025). In continual learning, it denotes Sparsity and Hidden Activation RePlay, a replay-based class-incremental learning method that combines sparse dynamic connectivity, rank-based freezing, and hidden activation replay (Gurbuz et al., 2023). In video LLMs, ShaRP denotes SHAllow-LayeR Pruning, a training-free shallow-layer token-pruning framework that combines segment-aware causal masking, positional debiasing, and token deduplication (Xia et al., 5 Dec 2025).

Further uses continue this dispersion. In robotics, SHARP denotes Supercomputing for High-speed Avoidance and Reactive Planning, an HPC-offloading architecture for millisecond-scale reactive planning with a 7-DOF manipulator (Lachmansingh et al., 23 Sep 2025). In astronomical instrumentation, SHARP is a near-infrared multi-mode spectrograph for the ELT and MORFEO MCAO system, comprising the NEXUS MOS and VESPER multi-IFU units (Mahmoodzadeh et al., 8 Sep 2025). In solar physics, SHARPs denotes Space-weather HMI Active Region Patches, an HMI vector-magnetic-field data product for active-region tracking and forecasting applications (Bobra et al., 2014).

Taken together, these uses show that SHARPy is not a uniquely identifying acronym across arXiv. In exact contemporary usage, however, the capitalization SHARPy refers specifically to the gravitational-wave SMC–NUTS framework (Demasi et al., 5 Jan 2026), whereas the clinically oriented hand-tracking system is spelled ShaRPy (Wirth et al., 2023). This distinction matters in bibliographic searches, software discovery, and cross-domain citation practice.