Spread Lemma in Probabilistic Combinatorics

Updated 3 December 2025

The spread lemma is a key tool in probabilistic combinatorics built on R-spread distributions: distributions on sets under which any fixed subset S is contained in the random set with probability at most R^(-|S|).
It underpins breakthroughs on the Erdős–Rado sunflower conjecture and the Kahn–Kalai conjecture, where it controls threshold phenomena for random sets and random graphs.
Recent work reproves the lemma via Bayesian inference and extends the idea, through two-phase embedding proofs, to spanning structures in dense and perturbed random graphs.

The spread lemma encapsulates a central technical concept in probabilistic combinatorics and the probabilistic method: it formalizes the existence of a distribution over structures (subsets, subgraphs, or embeddings) under which no fixed configuration is overly likely. It underlies major advances in extremal combinatorics, notably on threshold phenomena and the avoidance of structural concentration, and has recently received streamlined proofs and generalizations to embedding problems in dense and perturbed random structures.

1. Definition and Formal Statement

Let $X$ be a finite universe of size $N$, and let $\pi$ be a probability distribution on subsets $A \subseteq X$. The distribution $\pi$ is called $R$-spread if for every fixed subset $S \subseteq X$,

$\pi(S \subseteq A) \leq R^{-|S|}.$
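
The definition can be checked by brute force on a small universe. The following is a minimal sketch, not taken from the cited papers: the helper names `containment_prob` and `best_spread` are hypothetical, and the toy distribution (uniform on all $k$-subsets of an $n$-element universe) is an assumption chosen because its spread is easy to verify by hand.

```python
from itertools import combinations

def containment_prob(pi, S):
    """pi(S ⊆ A): total mass of the sets A that contain S."""
    S = frozenset(S)
    return sum(mass for A, mass in pi.items() if S <= A)

def best_spread(pi, universe):
    """Largest R for which pi is R-spread, by brute force over all nonempty S."""
    best = float("inf")
    for size in range(1, len(universe) + 1):
        for S in combinations(universe, size):
            prob = containment_prob(pi, S)
            if prob > 0:
                # pi(S ⊆ A) <= R^{-|S|}  is equivalent to  R <= prob^{-1/|S|}
                best = min(best, prob ** (-1.0 / size))
    return best

# Toy example (an assumption for illustration): uniform on all k-subsets of [n].
n, k = 8, 2
universe = range(n)
family = [frozenset(A) for A in combinations(universe, k)]
pi = {A: 1.0 / len(family) for A in family}

R = best_spread(pi, universe)
print(R)
```

For this toy distribution the binding constraint is $|S| = 1$, where $\pi(S \subseteq A) = k/n$, so the computed value agrees with $n/k$ up to floating-point error. The enumeration is exponential in $N$ and is only meant for sanity checks on tiny instances.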

When randomized embeddings are considered, a probability measure $\mu$ on injections $\varphi : X \rightarrow Y$ is called $q$-spread if for any $s$ distinct $x_1, \ldots, x_s \in X$ and corresponding distinct $y_1, \ldots, y_s \in Y$,

$\mu\bigl\{\varphi: \varphi(x_i) = y_i\ \forall i \bigr\} \leq q^s.$
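
As an illustration of this definition, the sketch below checks it for the simplest measure, a uniformly random injection into an $n$-element set (an assumed example, not one analysed in the cited papers). Pinning $s$ values has probability $(n-s)!/n!$, which is at most $(e/n)^s$, so the uniform measure is $q$-spread with $q = e/n$; the function names are hypothetical.

```python
from math import e, factorial

def pin_probability(n, s):
    """P(phi(x_i) = y_i for all i) under a uniformly random injection into n points."""
    # Only the s pinned values matter: n * (n-1) * ... * (n-s+1) equally likely choices.
    return factorial(n - s) / factorial(n)

def is_q_spread(n, m, q):
    """Check the q-spread inequality for every s = 1, ..., m (m = |X| <= n)."""
    return all(pin_probability(n, s) <= q ** s for s in range(1, m + 1))

n = m = 30
spread_with_e_over_n = is_q_spread(n, m, e / n)
spread_with_1_over_n = is_q_spread(n, m, 1 / n)
print(spread_with_e_over_n, spread_with_1_over_n)
```

The second check fails already at $s = 2$, since $1/(n(n-1)) > 1/n^2$; this is why spread bounds for embeddings are typically stated as $O(1/n)$ rather than exactly $1/n$.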

The classical spread lemma (Mossel et al., 2022) asserts the following. Let $\mathscr{A}$ be a family of subsets of $X$, each of size at most $k$, and let $\pi$ be an $R$-spread distribution on $\mathscr{A}$. If $p \geq C \frac{\log k}{R}$ for a suitable constant $C$ and $V \sim Q_p$ (where $Q_p$ is the law of the $p$-biased random subset of $X$), then the probability that $V$ contains some $A \in \mathscr{A}$ is at least 0.9.
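
A toy case where the statement can be verified exactly: take $\mathscr{A}$ to be $m$ pairwise disjoint blocks of size $k$ with the uniform distribution, which is $R$-spread for $R = m^{1/k}$. The sketch below is an illustration under these assumptions only; in particular, the value `C = 2.0` is a hypothetical choice, since the source does not specify the constant.

```python
import math
import random

def exact_cover_probability(m, k, p):
    """P(V contains at least one of m disjoint k-blocks) for V ~ Q_p."""
    return 1.0 - (1.0 - p ** k) ** m

def simulated_cover_probability(m, k, p, trials, rng):
    """Monte Carlo estimate of the same probability."""
    hits = 0
    for _ in range(trials):
        # Blocks are disjoint, so each is fully contained in V independently
        # with probability p^k.
        covered = any(all(rng.random() < p for _ in range(k)) for _ in range(m))
        hits += covered
    return hits / trials

m, k = 1000, 3
R = m ** (1.0 / k)      # uniform on blocks: pi(S ⊆ A) <= 1/m <= R^{-|S|} for 1 <= |S| <= k
C = 2.0                 # hypothetical constant, not taken from the source
p = C * math.log(k) / R
exact = exact_cover_probability(m, k, p)
estimate = simulated_cover_probability(m, k, p, trials=500, rng=random.Random(0))
print(p, exact, estimate)
```

Here $m p^k = (C \log k)^k$, so the covering probability is roughly $1 - \exp(-(C \log k)^k)$, comfortably above 0.9 for these parameters.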

Analogous notions are developed for graph embeddings: an embedding-distribution is $r$-spread if the probability that a prescribed mapping holds on any set of $k$ vertices is at most $r^{-k}$, which is the $q$-spread condition above with $q = 1/r$ (Nenadov et al., 8 Oct 2024, Bastide et al., 10 Sep 2024).

2. Historical Context and Core Applications

The spread lemma framework has driven significant advances in extremal and probabilistic combinatorics. It was central in the improved bounds of Alweiss, Lovett, Wu, and Zhang on the Erdős–Rado sunflower conjecture, and in the proof of the fractional Kahn–Kalai conjecture by Frankston, Kahn, Narayanan, and Park (Mossel et al., 2022). In modern graph theory, the spread lemma underpins robust expandability properties and resilience results (such as dense graphs embedding arbitrary bounded-degree trees and expansion properties in perturbed random graphs) (Bastide et al., 10 Sep 2024, Nenadov et al., 8 Oct 2024).

The methodology has evolved from delicate combinatorial counting and entropy bounds to Bayesian/statistical-inference-based proofs, and has inspired the "spread blow-up lemma," which couples probabilistic embedding with regularity and blow-up techniques.

3. Probabilistic Proofs and Techniques

Spread Lemma via Truncated Second Moment

The modern proof of the spread lemma is constructed within a Bayesian statistical inference framework:

Planting Trick: A set $A$ is planted from the $R$-spread distribution $\pi$, and independent "noise" $V \sim Q_p$ is added. Only the union $Y = A \cup V$ is observed.
Posterior Sampling: A new set $A'$ is drawn from the posterior $\pi(\cdot \mid Y)$.
Coupling: These steps define a coupling $P_p$ on $(A, A', V)$, where $A, A'$ marginally follow $\pi$.
The key technical statement is an iterative bound using a truncated second moment calculation under the null model.

With $\delta = (pR)^{-1/3}$, the probability that $|A \cap A'| > \delta |A|$ is small; this is established by bounding expectations of likelihood ratios (Radon–Nikodym derivatives) and applying Markov-type arguments. Iterating the bound produces a geometric decay in concentration, and the covering conclusion of the lemma follows after $O(\log k)$ iterations (Mossel et al., 2022).
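
The planting and posterior-sampling steps can be simulated directly on a small universe, where the posterior is computable by enumeration. The following is a minimal sketch under assumed toy parameters (uniform $\pi$ on the 3-subsets of a 10-element universe, $p = 0.3$); the function names are hypothetical, and it illustrates only the coupling, not the truncated second moment argument itself.

```python
import random
from itertools import combinations

def sample_p_biased(universe, p, rng):
    """V ~ Q_p: include each element independently with probability p."""
    return frozenset(x for x in universe if rng.random() < p)

def posterior(pi, Y, p):
    """Posterior pi(. | Y) for the observation Y = A ∪ V, by Bayes' rule.

    P(Y | A) = 1[A ⊆ Y] * p^(|Y| - |A|) * (1 - p)^(N - |Y|); the last factor
    does not depend on A and cancels in the normalisation.
    """
    weights = {A: mass * p ** (len(Y) - len(A)) for A, mass in pi.items() if A <= Y}
    total = sum(weights.values())
    return {A: w / total for A, w in weights.items()}

def planted_pair(pi, universe, p, rng):
    """One draw of (A, A', V): plant A, add noise V, resample A' from the posterior."""
    sets, masses = zip(*pi.items())
    A = rng.choices(sets, weights=masses)[0]
    V = sample_p_biased(universe, p, rng)
    post = posterior(pi, A | V, p)
    candidates, weights = zip(*post.items())
    A_prime = rng.choices(candidates, weights=weights)[0]
    return A, A_prime, V

# Toy instance (an assumption for illustration).
rng = random.Random(1)
universe = range(10)
family = [frozenset(A) for A in combinations(universe, 3)]
pi = {A: 1.0 / len(family) for A in family}
p = 0.3

overlaps = []
for _ in range(2000):
    A, A_prime, V = planted_pair(pi, universe, p, rng)
    overlaps.append(len(A & A_prime) / len(A))
print(sum(overlaps) / len(overlaps))   # empirical mean of |A ∩ A'| / |A|
```

The printed average is an empirical estimate of the relative overlap $|A \cap A'| / |A|$, the quantity that the argument above compares with $\delta$.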

Spread in Structural Embedding

For graph embeddings, the spread property is formalized either over injections or over spanning subgraphs (e.g., trees, Hamilton cycles). The embedding procedures (sequential embedding, use of auxiliary almost-complete graphs, leveraging Chernoff/McDiarmid concentration, and permutation-based randomization) ensure that the resulting measure is $O(1/n)$-spread for host graphs on $n$ vertices (Bastide et al., 10 Sep 2024, Nenadov et al., 8 Oct 2024).

4. Extensions: Spread Blow-Up Lemma and Spanning Structures

The spread blow-up lemma extends the philosophy of the spread lemma to the simultaneous embedding of large spanning structures while maintaining spreadness (Nenadov et al., 8 Oct 2024):

Setup: A reduced graph $R$ on the clusters is given, together with a host graph $G$ whose vertex set is partitioned into clusters $V_i$; the super-regular pairs of clusters supply the required expansion.
Spread Embedding: The measure on embeddings $\phi : H \rightarrow G$ is $O(1/N)$-spread, i.e., the probability of mapping any $k$ pre-specified vertices of $H$ to distinct targets in the clusters is at most $(C/N)^k$.
Two-Phase Proof: Sequential embedding (Phase I) maintains quasirandomness via second-moment bounds; embedding of reserved vertices (Phase II) employs the Pham–Sah–Sawhney–Simkin result on spread matchings.

Applications

The spread property dovetails with the (now proved) Kahn–Kalai conjecture: for monotone families, the spread embedding ensures that exposure to randomly thinned edges or vertices (e.g., random graphs added atop deterministic hosts) achieves sharp thresholds for spanning subgraph appearance. For example, the proof of approximate thresholds for the emergence of the kth power of Hamilton cycles in perturbed random graphs relies crucially on the spread blow-up lemma (Nenadov et al., 8 Oct 2024).
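
As a small illustration of how a spread measure turns into a threshold bound, consider a standard textbook case (chosen here for simplicity; it is not one of the applications in the cited papers). Let $\mathscr{A}$ be the family of perfect matchings of the complete bipartite graph $K_{n,n}$, viewed as edge sets of size $k = n$, and let $\pi$ be uniform on $\mathscr{A}$. For a set $S$ of $s$ pairwise disjoint edges,

$\pi(S \subseteq A) = \frac{(n-s)!}{n!} \leq \left(\frac{e}{n}\right)^{s},$

while $\pi(S \subseteq A) = 0$ if two edges of $S$ share a vertex. Hence $\pi$ is $R$-spread with $R = n/e$, and the spread lemma as stated in Section 1 applies once

$p \geq C \frac{\log k}{R} = \frac{C e \log n}{n},$

so that a $p$-biased random subset of the edges of $K_{n,n}$ contains a perfect matching with probability at least 0.9. This recovers the correct order $\log n / n$ of the perfect-matching threshold.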

5. The Spread Lemma in Random Partition Embeddings

Central to obtaining optimally spread embeddings is the construction of random partitions into constant-sized parts, each inheriting robust degree properties from the host. In the context of tree embeddings in graphs with linear minimum degree (Bastide et al., 10 Sep 2024):

The host is partitioned randomly via uniform permutations, forming bags of prescribed sizes.
With high probability, each bag and the auxiliary graph of bags retain strong minimum-degree and near-completeness properties.
The resulting partition satisfies: for any $s$ vertices and assignments to bags, the probability all are mapped as prescribed is at most $(12C/n)^s$.
This spreadness is bootstrapped through the multi-stage embedding of the target tree, ensuring the final law is $O(1/n)$-vertex-spread.

This construction is purely combinatorial and avoids reliance on regularity lemma machinery, relying instead on Chernoff and McDiarmid concentration inequalities and a spread analysis of random permutations.
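
The permutation step of the partition can be sketched as follows. This is a simplified illustration, not the construction of Bastide et al.: it ignores the degree conditions entirely, the function names are hypothetical, and the bound checked is the elementary $(b/(n-s+1))^s$ for bags of size $b$ rather than the paper's $(12C/n)^s$.

```python
import random

def random_bags(vertices, sizes, rng):
    """Shuffle the vertices uniformly and cut the permutation into consecutive bags."""
    order = list(vertices)
    assert sum(sizes) == len(order)
    rng.shuffle(order)
    bags, start = [], 0
    for size in sizes:
        bags.append(order[start:start + size])
        start += size
    return bags

def prescribed_probability(sizes, targets):
    """Exact P(the i-th prescribed vertex lands in bag targets[i], for all i).

    The prescribed vertices are distinct. Placing them one at a time, each goes
    to a uniformly random free slot, so it hits its bag with probability
    (free slots in that bag) / (free slots overall).
    """
    free = list(sizes)
    remaining = sum(sizes)
    prob = 1.0
    for bag in targets:
        prob *= free[bag] / remaining
        free[bag] -= 1
        remaining -= 1
    return prob

# Toy parameters (assumptions for illustration): n vertices, bags of size b.
rng = random.Random(0)
n, b = 60, 5
sizes = [b] * (n // b)
bags = random_bags(range(n), sizes, rng)

targets = [0, 0, 3]          # three distinct vertices sent to bags 0, 0 and 3
s = len(targets)
prob = prescribed_probability(sizes, targets)
bound = (b / (n - s + 1)) ** s
print(prob, bound)
```

Because the bag sizes are constant, the per-vertex factor $b/(n-s+1)$ is $O(1/n)$ whenever $s$ is at most a constant fraction of $n$, which is the shape of bound that the later embedding stages build on.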

6. Relationship to Thresholds, Robustness, and Open Directions

The spread lemma and its generalizations establish that the avoidance of concentration in random or random-like structures is structurally robust. This yields:

Threshold Sharpness: Spread measures ensure that the appearance thresholds of certain spanning structures in random or perturbed graphs match the fractional expectation thresholds up to logarithmic factors, as guaranteed by the Kahn–Kalai theorem.
Embedding Robustness: Dense graphs supporting optimally spread embeddings of bounded-degree trees remain robust under random deletion of edges or vertices.
Open Questions: Extending the spread methodology to hypergraphs and improving density thresholds to exact Dirac-type windows remain active areas, with implications for numerous embedding and resilience problems (Nenadov et al., 8 Oct 2024).

Key References:

"A second moment proof of the spread lemma" (Mossel et al., 2022)
"Spread blow-up lemma with an application to perturbed random graphs" (Nenadov et al., 8 Oct 2024)
"Random embeddings of bounded degree trees with optimal spread" (Bastide et al., 10 Sep 2024)