
Poison Distance Metric

Updated 9 September 2025
  • Poison distance is a metric that quantifies the separation between disruptive elements and the systems they affect, encompassing physical, statistical, and embedded notions of distance.
  • It underpins studies in nuclear astrophysics, random graph theory, and machine learning by linking mathematical formulations with empirical detection methods.
  • Measuring poison distance offers actionable insights into system vulnerabilities and informs robust strategies for detection and defense.

Poison distance is a conceptual and operational metric that quantifies the separation—either physical, statistical, or embedded—between elements that disrupt the state or reliability of a scientific system. Its manifestation and utility span disparate domains, from nuclear astrophysics and percolation models to neural network security. The examination of poison distance enables rigorous characterizations of how perturbations, anomalies, or malicious patterns can propagate, be detected, or influence the system-level behavior. This article synthesizes recent research threads, highlighting mathematical formulations, detection methodologies, and practical implications for fields where the notion of "distance to disruption" is critical.

1. Fundamental Definitions and Contexts

In nucleosynthesis, poison distance is implicitly defined as the effective extent to which a species (such as primary $^{16}$O) depletes the resource (neutron flux) that drives the desired reactions, quantified by the product of cross-section ($\sigma$) and number density ($N$) within stellar environments (Gallino et al., 2010).

In spatial random graphs, chemical distance (alternatively referred to as poison distance in this context) describes the minimal path length between nodes subject to connectivity rules, typically regulated by heavy-tailed distributions of object diameters and geometric intersection criteria (Gracar et al., 24 Mar 2025).

In machine learning security, poison distance denotes either explicit geometric distance in latent feature spaces or operational separation in terms of loss function impacts between poisoned and benign samples, exploited both by attackers and forensic defenders (Shan et al., 2021, Qi et al., 2022, Feng et al., 26 May 2025, Peinemann et al., 7 Aug 2025).

Across these contexts, poison distance acts as a proxy for the potential, detectability, or impact of system poisoning, whether through physical processes (nuclear capture, percolation) or adversarial manipulation (data poisoning, model perturbation).

2. Mathematical Formulations and Regimes

Nuclear Astrophysics

In AGB stars, the poison distance is governed by the product $\sigma N$ for primary $^{16}$O:

  • Even a species with a low neutron-capture cross-section can be a potent poison if $N$ is large, yielding a high $\sigma N$ that drains neutron flux from the s-process pathway (Gallino et al., 2010).
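The trade-off between cross-section and abundance can be sketched in a few lines. The numerical values below are hypothetical placeholders, not measured stellar abundances; only the comparison via the $\sigma N$ product follows the text.

```python
# Illustrative comparison of neutron-poison strength via the sigma * N product.
# Cross-sections (sigma, in mb) and relative number densities (N) are
# hypothetical placeholder values, not measured stellar abundances.

def poison_strength(sigma_mb, number_density):
    """Effective neutron-capture drain of a species: sigma * N."""
    return sigma_mb * number_density

# A species with a small cross-section but large abundance (the role played
# by primary 16O in the text) can out-poison a rarer species with a much
# larger cross-section.
abundant_low_sigma = poison_strength(sigma_mb=0.2, number_density=1e6)
rare_high_sigma = poison_strength(sigma_mb=50.0, number_density=1e2)
```

With these placeholder numbers the abundant low-cross-section species dominates the neutron drain, mirroring the role of $^{16}$O.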

Percolation and Random Graphs

The chemical distance between two vertices $\mathbf{x}$ and $\mathbf{y}$ in a Poisson Boolean model is given asymptotically by

$$\text{dist}(\mathbf{x}, \mathbf{y}) \sim c \log\log |\mathbf{x} - \mathbf{y}|$$

with

$$c = \frac{2}{\log\left( \frac{\min\{d-\kappa, \kappa\}}{\alpha_\kappa - \kappa} \right)}$$

where the $\alpha_k$ are regularly varying diameter indices, $d$ is the dimension, and $\kappa$ is determined per the model (Gracar et al., 24 Mar 2025).

Critical conditions:

  • "Ultrasmall" regime ($\text{dist} \sim c \log\log |\mathbf{x} - \mathbf{y}|$) if $k < \alpha_k < \min\{2k, d\}$ for some $k$ and $\alpha_k > k$ for all $k$.
  • For heavier tails ($\alpha_k \leq k$ for some $k$), the distance falls below any constant multiple of $\log\log |\mathbf{x} - \mathbf{y}|$.
  • For lighter tails ($\alpha_k \geq \min\{2k, d\}$ for all $k$), the distance exceeds every constant multiple of $\log\log |\mathbf{x} - \mathbf{y}|$.
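The doubly-logarithmic scaling can be made concrete with a short numerical sketch. The parameter values $(d, \kappa, \alpha_\kappa)$ below are hypothetical and chosen only to satisfy the ultrasmall condition $\kappa < \alpha_\kappa < \min\{2\kappa, d\}$; the prefactor formula is the one stated above.

```python
import math

# Sketch of the asymptotic chemical-distance scaling in the ultrasmall regime,
# dist(x, y) ~ c * log log |x - y|, with the prefactor c from the text.

def scaling_constant(d, kappa, alpha_kappa):
    """c = 2 / log( min{d - kappa, kappa} / (alpha_kappa - kappa) )."""
    return 2.0 / math.log(min(d - kappa, kappa) / (alpha_kappa - kappa))

def chemical_distance(separation, d, kappa, alpha_kappa):
    """Leading-order chemical distance between points at the given separation."""
    c = scaling_constant(d, kappa, alpha_kappa)
    return c * math.log(math.log(separation))

d, kappa, alpha_kappa = 3, 2, 2.5   # satisfies 2 < 2.5 < min{4, 3}
near = chemical_distance(1e3, d, kappa, alpha_kappa)
far = chemical_distance(1e12, d, kappa, alpha_kappa)
# Doubly-logarithmic growth: nine orders of magnitude more separation
# increases the graph distance only modestly.
```

This illustrates why such networks are called ultrasmall: graph distances barely grow even as Euclidean separation explodes.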

Machine Learning

Forensics

The poison distance in feature/parameter space is computed via

$$\text{Poison Distance}(x) = \left\| \nabla_\theta\, \ell(\mathcal{F}(x), \text{NULL}) - \text{benign centroid} \right\|_2$$

A greater separation facilitates robust clustering of poisons for traceback (Shan et al., 2021).
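A minimal numerical sketch of this gradient-space distance follows. A linear model with squared loss stands in for $\mathcal{F}$, and the "NULL" reference label is approximated by a zero target; both are simplifying assumptions, not the method of Shan et al. as implemented.

```python
import numpy as np

# Sketch: || grad_theta loss(F(x), target) - benign_centroid ||_2 for a
# linear model with squared loss. The zero target standing in for the
# "NULL" label is an assumption made for this toy example.

rng = np.random.default_rng(0)
theta = rng.normal(size=5)

def grad_loss(x, target=0.0):
    """Gradient of 0.5 * (theta . x - target)^2 with respect to theta."""
    return (theta @ x - target) * x

benign = [rng.normal(size=5) for _ in range(100)]
benign_centroid = np.mean([grad_loss(x) for x in benign], axis=0)

def poison_distance(x):
    return np.linalg.norm(grad_loss(x) - benign_centroid)

# A sample pushed far along one coordinate produces a gradient far from the
# benign centroid, making it easier to isolate by clustering.
typical = poison_distance(benign[0])
outlier = poison_distance(benign[0] + np.array([10.0, 0, 0, 0, 0]))
```

The larger the distance, the cleaner the separation between benign and poison clusters during traceback.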

Proactive Detection

Confusion Training (CT) enforces and amplifies poison distance by disrupting correlations among clean samples and magnifying fitting disparity:

$$\ell_{ct} = \frac{1}{\lambda} \left[ \mathcal{L}(f(\tilde{X};\theta), \tilde{Y}) + (\lambda - 1)\, \mathcal{L}(f(X'; \theta), Y^*) \right]$$

where confusion batches carry randomized labels and a large weight $\lambda$ (Qi et al., 2022).
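A toy version of this objective can be written directly from the formula. The cross-entropy loss, class counts, and uniform predictions below are illustrative assumptions; only the weighting structure mirrors $\ell_{ct}$ as stated.

```python
import numpy as np

# Sketch of the confusion-training objective: a confusion batch with
# randomized labels is mixed with the ordinary training loss, so poisoned
# samples (whose trigger-label shortcut survives relabeling) fit visibly
# differently from clean ones. Shapes and losses here are toy assumptions.

rng = np.random.default_rng(1)

def cross_entropy(probs, labels):
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def confusion_training_loss(probs_clean, y_clean, probs_conf, lam):
    """l_ct = (1/lam) * [ L(f(X_conf), Y_random) + (lam - 1) * L(f(X), Y) ]."""
    y_random = rng.integers(0, probs_conf.shape[1], size=len(probs_conf))
    conf_term = cross_entropy(probs_conf, y_random)
    clean_term = cross_entropy(probs_clean, y_clean)
    return (conf_term + (lam - 1) * clean_term) / lam

probs = np.full((8, 4), 0.25)          # uniform predictions over 4 classes
labels = rng.integers(0, 4, size=8)
loss = confusion_training_loss(probs, labels, probs, lam=10.0)
```

With uniform predictions both terms equal $\log 4$, so the combined loss does too; any deviation between the terms signals a fitting disparity.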

Backdoor Injection and Feature Embedding

ShadowPrint attack minimizes poison distance among feature embeddings via clustering loss:

$$L_{\text{cluster}} = \frac{1}{N^2} \sum_{i,j,\, i \neq j} \frac{Z_i \cdot Z_j^T}{\|Z_i\|\, \|Z_j\|}$$

where the $Z_i$ are the feature vectors of triggered samples (Feng et al., 26 May 2025).
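The clustering term can be sketched directly from the formula above. The toy embeddings are invented for illustration; whether the attacker maximizes this similarity or minimizes its negative is an implementation detail not specified here.

```python
import numpy as np

# Sketch of the ShadowPrint-style clustering term: mean pairwise cosine
# similarity among feature embeddings of triggered samples, normalized by
# N^2 as in the stated formula. Driving this value up pulls the embeddings
# together, i.e. it minimizes the poison distance in feature space.

def cluster_loss(Z):
    """Sum of off-diagonal cosine similarities, divided by N^2."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Zn @ Zn.T
    n = len(Z)
    return (sim.sum() - np.trace(sim)) / (n * n)

# Toy embeddings: one tight cluster, one spread set of directions.
tight = np.array([[1.0, 0.01], [1.0, -0.01], [0.99, 0.0]])
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

Tightly clustered embeddings score near the maximum while spread directions score low or negative, which is exactly the gap the attack exploits and a detector could monitor.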

Minimal Poison Quantity

The one-poison hypothesis establishes that, in linear regression and classification, a single carefully constructed poison sample, placed along a direction unused by the benign data, can achieve zero backdoor error with negligible impact on benign accuracy. In effect, the number of poisons in the training set is minimized to one (Peinemann et al., 7 Aug 2025).
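A minimal sketch of this construction for least-squares regression follows. The dimensions, targets, and trigger direction are toy choices, not the construction from Peinemann et al.; the point is only that a single poison along an axis orthogonal to the benign data's span leaves the benign fit untouched.

```python
import numpy as np

# One-poison sketch: benign data lives in the first 3 of 4 dimensions, so a
# single poison placed along the unused 4th axis installs a backdoor weight
# without disturbing the benign least-squares fit.

rng = np.random.default_rng(2)

X_benign = np.hstack([rng.normal(size=(50, 3)), np.zeros((50, 1))])
w_true = np.array([1.0, -2.0, 0.5, 0.0])
y_benign = X_benign @ w_true

trigger = np.array([0.0, 0.0, 0.0, 1.0])   # attacker-chosen unused direction
backdoor_target = 7.0
X = np.vstack([X_benign, trigger])
y = np.append(y_benign, backdoor_target)

w_fit, *_ = np.linalg.lstsq(X, y, rcond=None)

benign_mse = np.mean((X_benign @ w_fit - y_benign) ** 2)
backdoor_error = (trigger @ w_fit - backdoor_target) ** 2
# benign_mse stays ~0 while the backdoor target is fit exactly.
```

Because the trigger direction is orthogonal to every benign sample, the solver can satisfy the poison constraint independently of the benign fit, which is the geometric core of the one-poison argument.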

3. Detection, Traceback, and Defensive Amplification

Forensic frameworks exploit poison distance for traceback:

  • Iterative clustering in gradient space leverages quantifiable poison distance to separate and prune benign clusters, isolating the minimal effective poison set. Unlearning techniques erase the influence of candidate clusters with high computational efficiency (Shan et al., 2021).
  • Robustness to anti-forensics (poison disguise, helper data, bifurcation) depends critically on maintaining sufficient poison distance for effective clustering.

Proactive defense amplifies poison distance to facilitate detection:

  • CT dynamically stretches the separation between poisoned and clean samples in fitting/loss space, producing detectable gaps that adversarial samples cannot disguise under randomized relabeling and a high confusion weight. This yields favorable TPR/FPR trade-offs across diverse attack types and data modalities (Qi et al., 2022).

In feature-space attacks (ShadowPrint), minimized poison distance among triggered sample embeddings maximizes attack stealth and success but complicates traditional anomaly detection, suggesting advanced defenses must target deviations in embedding statistics rather than raw input patterns (Feng et al., 26 May 2025).

4. System Impact and Empirical Evaluations

Stellar Synthesis

  • For maximal $^{13}$C-pocket efficiencies, the neutron poison distance (the extent of $^{16}$O-induced flux depletion) significantly suppresses heavy s-element production, observable as reduced [hs/Fe] and [Pb/Fe] ratios.
  • Protons released by the $^{14}$N$(n,p)\,^{14}$C reaction, despite $^{16}$O poisoning, boost primary fluorine synthesis in Halo-metallicity AGB stars, producing [F/Fe] ratios comparable to [C/Fe].

Backdoor Attack Success

  • ShadowPrint achieves ASR approaching 100%, CA decay below 1%, and DDR averaging below 5% across both clean- and dirty-label regimes at poison rates as low as 0.01%, directly correlating with minimal poison distance in the feature embedding space (Feng et al., 26 May 2025).
  • In linear models, a solitary poison sample is experimentally validated as sufficient for backdoor activation with essentially unaltered benign error (Parkinsons regression: clean MSE = 0.165, poisoned = 0.166; backdoor MSE = 0) (Peinemann et al., 7 Aug 2025).

Random Geometric Graphs

  • The explicit scaling of chemical distance (poison distance) in Poisson Boolean models predicts ultrasmall network behavior, with double-logarithmic scaling for robust but non-dense parameter regimes, and critical transitions as tail indices cross boundaries (Gracar et al., 24 Mar 2025).

5. Practical Recommendations and Future Implications

  • In adversarial ML, maximizing poison distance (by defenders) enhances the likelihood of successful isolation via clustering or loss-based distillation. Conversely, minimizing poison distance (by attackers) in latent space heightens stealth and circumvents input-level anomaly detectors (Qi et al., 2022, Feng et al., 26 May 2025).
  • Empirical implementations should design detection and cleansing pipelines to:
    • Use confusion batches with dynamic relabeling and large confusion weights.
    • Monitor latent representations for unusual clustering or compressed distributions.
    • Deploy forensic clustering algorithms in model-impact spaces, not just input space.
  • In physical systems, poison distance quantification provides insight into material selection and process optimization to mitigate unwanted depletion (e.g., isotopic composition in nuclear synthesis).
  • Statistical guarantees in minimal poison regimes (one-poison attacks) indicate a need for omnipresent verification processes, as vanishingly small poison distances can induce catastrophic behaviors if system vulnerabilities align with unused data dimensions (Peinemann et al., 7 Aug 2025).
  • The analytical techniques in heavy-tailed percolation and continuum random graphs may inform resilience strategies in both natural and engineered networks, where poison distance scaling emerges as a proxy for robustness and shortcut proliferation (Gracar et al., 24 Mar 2025).
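The latent-representation monitoring recommended above can be sketched as a simple batch-level statistic. The threshold margin, embedding shapes, and synthetic data below are illustrative assumptions, not a production detector.

```python
import numpy as np

# Sketch of a latent-space monitor: flag a batch whose embeddings are
# unusually compressed (high mean pairwise cosine similarity) relative to a
# reference statistic computed on clean data. Margin and data are assumed.

def mean_pairwise_cosine(Z):
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Zn @ Zn.T
    n = len(Z)
    return (sim.sum() - n) / (n * (n - 1))   # exclude the diagonal

def flag_compressed(Z, clean_reference, margin=0.2):
    """Flag batches far more clustered than clean batches typically are."""
    return mean_pairwise_cosine(Z) > clean_reference + margin

rng = np.random.default_rng(3)
clean = rng.normal(size=(64, 16))
clean_ref = mean_pairwise_cosine(clean)   # near 0 for random directions

# Synthetic "compressed" batch: one base vector plus small noise, mimicking
# triggered samples that collapse toward a shared embedding.
suspicious = rng.normal(size=(1, 16)) + 0.05 * rng.normal(size=(64, 16))
```

A monitor of this shape targets deviations in embedding statistics rather than raw input patterns, which is the defensive direction the ShadowPrint analysis suggests.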

6. Cross-Domain Significance and Conceptual Outlook

Poison distance—whether manifest as neutron flux depletion, chemical path length, or statistical/feature-space discrepancy—serves as a unifying metric for evaluating the resilience, vulnerability, and detectability of perturbations in complex systems. Its quantification underpins rigorous model-building, attack-detection methods, and predictive scaling laws across physical and computational domains. The systematic analysis, amplification, or minimization of poison distance thus remains central both to understanding fundamental processes and to developing robust defense and forensic tools in settings where the boundary between benign operation and disruptive poisoning is subtle but consequential.