Effect of incident subgraph sampling on separability of heavy-tailed alternatives from power laws

Determine how incident subgraph sampling affects the separability of heavy-tailed distributions, such as lognormal and stretched exponential distributions, from power-law distributions. Incident subgraph sampling is edge-induced subgraph sampling in which each edge is retained independently with probability π and its endpoints are included in the sample. The goal is to understand the biases that subsampling introduces when distinguishing these distribution families.

Background

The paper studies how subsampling impacts the identification of power laws versus other heavy-tailed distributions. It focuses on incident subgraph sampling, a scheme common in network science and biodiversity contexts, where each edge (or constituent part) is sampled with probability π, so that observed degrees are binomially thinned versions of the original degrees. Because this thinning changes the shape of the observed distribution, it can complicate classifying the data as a power law or as another heavy-tailed family.
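To make the binomial thinning concrete, the following is a minimal sketch, not the authors' code. It assumes the original degree distribution is given as a finite probability table, and it conditions on an observed degree of at least 1, since a node with no retained edge is not an endpoint of any sampled edge and so does not appear in the sample. The function names `thinned_pmf` and `truncated_power_law`, the cutoff `k_max`, and the parameter values are hypothetical choices made for illustration.

```python
from math import comb


def thinned_pmf(pmf, pi):
    """Observed-degree distribution after keeping each edge with probability pi.

    pmf maps an original degree k to its probability (finite support).
    A node of degree k keeps Binomial(k, pi) edges; nodes that keep none
    are absent from the sample, so the result is conditioned on degree >= 1.
    """
    k_max = max(pmf)
    raw = [0.0] * (k_max + 1)
    for k, p in pmf.items():
        for j in range(k + 1):
            raw[j] += p * comb(k, j) * pi**j * (1 - pi) ** (k - j)
    seen = 1.0 - raw[0]  # probability that a node keeps at least one edge
    if seen <= 0.0:
        raise ValueError("no node is observed for this pmf and pi")
    return {j: raw[j] / seen for j in range(1, k_max + 1)}


def truncated_power_law(alpha, k_max):
    """Hypothetical test input: p(k) proportional to k**(-alpha) on 1..k_max."""
    weights = {k: k ** (-alpha) for k in range(1, k_max + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Illustrative parameters only; they are not taken from the paper.
original = truncated_power_law(alpha=2.5, k_max=200)
observed = thinned_pmf(original, pi=0.3)
```

Comparing `original` and `observed` on log-log axes is one way to see how the small-degree part of the distribution changes under sampling. The truncation at `k_max` is a convenience of this sketch and distorts the far tail.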

Prior work has shown that the tail of a power law can remain asymptotically similar under subsampling, while the form of the distribution at small degrees changes. This makes separability from heavy-tailed alternatives (lognormal, stretched exponential) a key issue. The authors highlight that, at the outset of their study, the impact of incident subgraph sampling on such separability was not known.

References

However, we do not currently know how incident subgraph sampling affects the separability of other heavy-tailed distributions from power-law distributions.

Distinguishing subsampled power laws from other heavy-tailed distributions (2404.09614 - Sormunen et al., 15 Apr 2024) in Section 1 Introduction