- The paper establishes a Gaussianization theorem that replaces dependent entries with independent Gaussian variables, ensuring convergence of the empirical spectral distribution.
- It derives a non-universal semicircle law whose variance is a convex combination explicitly dependent on hyperedge sizes and inclusion probabilities.
- The analysis delineates dominant and balanced regimes, offering insights for spectral predictions in complex higher-order network models.
Introduction and Motivation
The analysis of random hypergraphs—generalizations of graphs in which edges can connect more than two vertices—has become increasingly significant for modeling complex systems with higher-order interactions. Classic random graph models, particularly the Erdős–Rényi (ER) random graph, have well-characterized spectral and phase transition properties. However, many real-world networks are naturally non-uniform: multi-body interactions of varying arities coexist, and the probabilities for different edge sizes can be distinct and scale with the system size. This work presents a comprehensive investigation into the spectral properties of a highly general class of non-uniform, inhomogeneous ER hypergraphs, focusing on the limiting spectral distribution (LSD) of their empirical adjacency matrices.
Model Specification
The studied object is the $(n, r, p)$-Erdős–Rényi hypergraph, where $n$ is the number of vertices, $r = (r_1, \dots, r_k)$ is the tuple of possible hyperedge sizes, and $p = (p_1, \dots, p_k)$ gives the corresponding inclusion probabilities (potentially depending on $n$). Each possible hyperedge of size $r_i$ is present independently with probability $p_i$, yielding a hypergraph that is both non-uniform and inhomogeneous.
The adjacency matrix $A$ is defined such that $A_{uv}$ (for $u \neq v$) counts the number of hyperedges (of all permitted sizes) containing both $u$ and $v$. The eigenvalue distribution of a suitably normalized and centered version of $A$ is the main focus.
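The definition above can be made concrete with a small sampler. This is a minimal sketch using only the Python standard library; the function name `sample_adjacency` and its arguments are hypothetical, and the brute-force enumeration of all candidate hyperedges is an illustrative choice rather than anything taken from the paper.

```python
import itertools
import random


def sample_adjacency(n, sizes, probs, seed=None):
    """Sample the co-occurrence adjacency matrix of an (n, r, p) hypergraph.

    sizes -- tuple of hyperedge sizes (r_1, ..., r_k)
    probs -- tuple of inclusion probabilities (p_1, ..., p_k)
    Returns an n x n list of lists with A[u][v] = number of sampled
    hyperedges containing both u and v (the diagonal is left at zero).
    """
    rng = random.Random(seed)
    A = [[0] * n for _ in range(n)]
    for r, p in zip(sizes, probs):
        # Every r-subset of the vertex set is a candidate hyperedge,
        # included independently with probability p.
        for edge in itertools.combinations(range(n), r):
            if rng.random() < p:
                # A sampled hyperedge adds 1 to every pair it contains.
                for u, v in itertools.combinations(edge, 2):
                    A[u][v] += 1
                    A[v][u] += 1
    return A


if __name__ == "__main__":
    A = sample_adjacency(8, sizes=(2, 3), probs=(0.5, 0.1), seed=0)
    for row in A:
        print(row)
```

The enumeration visits every candidate hyperedge, so the cost grows combinatorially in `n`; the sketch is only meant for small sanity checks. Centering and normalization are deliberately left out because the summary does not state their exact form.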
Main Results
Gaussianization and Sufficient Conditions
The first technical challenge is the dependency structure of the matrix entries for non-uniform hypergraphs, in contrast to the independent entries in Wigner matrices or classical ER graphs. The paper establishes a Gaussianization theorem: under a Pastur-type condition (in the sense of Chatterjee, 2005), the random vector of matrix entries can be replaced by independent Gaussian entries with matching mean and variance, without affecting the limiting spectral distribution. The Pastur-type condition, which is a moment tail bound, ensures replaceability via a Lindeberg replacement approach adapted for matrices with correlated entries.
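For orientation, the classical Lindeberg swapping argument behind this kind of replacement can be sketched as follows. This is the textbook version for independent coordinates, not the adapted argument for correlated entries; the symbols $X$, $Y$, $Z_j$, $f$, and $N$ are notation assumed for this sketch only.

```latex
% X = (X_1, ..., X_N): original entries.
% Y = (Y_1, ..., Y_N): Gaussian entries with matching means and variances.
% f: a smooth test function of the entries, e.g. the Stieltjes transform
%    of the empirical spectral distribution at a fixed z.
\begin{align*}
Z_j &= (Y_1, \dots, Y_j, X_{j+1}, \dots, X_N), \qquad Z_0 = X, \quad Z_N = Y, \\
\mathbb{E} f(X) - \mathbb{E} f(Y)
  &= \sum_{j=1}^{N} \bigl( \mathbb{E} f(Z_{j-1}) - \mathbb{E} f(Z_j) \bigr).
\end{align*}
```

Each summand changes a single coordinate. Expanding $f$ to third order in that coordinate, the first- and second-order terms cancel because the means and variances match, so only a third-order remainder has to be controlled, which is the role played by a moment tail bound of the Pastur type.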
A non-sparsity condition, phrased succinctly in terms of the generalized average degrees of each edge-size class, is shown to guarantee this Pastur-type condition, and thus aids practical model verification.
Formally, the non-sparsity condition is a displayed requirement on these degrees [equation not recoverable from the source], with weights defined as the normalized contributions of each edge size to the overall variance.
Limiting Spectral Distribution and Combined Variance
Given the Gaussianization, the LSD is established as a semicircle law but, crucially, the variance is not universal; it depends explicitly on the vector of edge sizes and probabilities:
[equation not recoverable from the source]
where the quantities entering the formula are limiting normalized variances contributed by each edge size. This generalizes prior results for uniform hypergraphs and standard ER graphs.
Key implications:
- The variance is explicitly a convex combination of variances from the uniform cases, weighted by contributions of each edge class.
- In non-uniform models, the semicircle variance can be strictly less than 1 even if the entrywise second moment converges to 1, due to the emergence of outliers (as in BBP-type phenomena).
- Balanced or dominant regimes can be tuned by adjusting the scaling of the inclusion probabilities $p_i$: a single edge-size class can dominate, or multiple classes may jointly influence the LSD.
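The convex-combination structure described in the first implication can be written schematically as below. The symbols $w_i$ and $\sigma_i^2$ are notation assumed for this sketch (class weights and per-class limiting variances); the exact expressions for them in terms of $r$ and $p$ are not reproduced here. The semicircle density with variance $\sigma^2$ is the standard one.

```latex
\begin{align*}
\sigma^2 &= \sum_{i=1}^{k} w_i \, \sigma_i^2,
  \qquad w_i \ge 0, \quad \sum_{i=1}^{k} w_i = 1, \\
\rho_{\sigma}(x) &= \frac{1}{2\pi\sigma^2} \sqrt{4\sigma^2 - x^2}
  \;\mathbf{1}_{\{|x| \le 2\sigma\}}.
\end{align*}
```

Under this form the combined variance always lies between the smallest and the largest of the $\sigma_i^2$. A dominant regime corresponds to one weight tending to 1, and a balanced regime to several weights staying bounded away from 0.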
Parameter Regimes and Examples
The work carefully analyzes the “dominant” versus “balanced” scenarios: for example, when two edge-size classes compete, whether one class wins in the limit depends sharply on the scaling of their average degrees.
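The competition between two classes can be illustrated numerically with a simple proxy. The sketch below assumes that each class's influence is measured by its share of the entrywise variance of the adjacency matrix: an off-diagonal entry is a sum of independent Bernoulli indicators, and exactly `comb(n - 2, r - 2)` hyperedges of size `r` contain a given pair. The function names are hypothetical, and this proxy is not claimed to coincide with the weights used in the paper.

```python
from math import comb


def class_contributions(n, sizes, probs):
    """Entrywise variance contributed by each edge-size class.

    For a fixed pair (u, v), comb(n - 2, r - 2) hyperedges of size r
    contain both vertices, each present independently with probability p.
    """
    return [comb(n - 2, r - 2) * p * (1 - p) for r, p in zip(sizes, probs)]


def size3_share(n, p2, p3):
    """Fraction of the entrywise variance due to size-3 hyperedges."""
    c2, c3 = class_contributions(n, (2, 3), (p2, p3))
    return c3 / (c2 + c3)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        # p3 fixed: the number of triples through a pair grows like n,
        # so the size-3 class takes over.
        dominant = size3_share(n, 0.5, 0.5)
        # p3 of order 1/n: both classes keep a non-vanishing share.
        balanced = size3_share(n, 0.5, 0.5 / (n - 2))
        print(n, round(dominant, 4), round(balanced, 4))
```

With a fixed `p3` the size-3 share tends to 1, while scaling `p3` like `1/n` keeps it near 2/3 for these particular constants, which mirrors the dominant and balanced scenarios at the level of entrywise variance.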
Specific corollaries address:
- Hyperedges with sizes scaling linearly with $n$: Sufficiently large (exponential) probabilities are required for non-trivial LSDs.
- Fixed edge sizes: For classical graph superpositions ($r_i = 2$), the standard semicircle law is recovered under mild density conditions.
The regime in which both classes are sparse is not covered and is left open for future investigation, as the analysis in sparse regimes requires tools like local weak convergence.
Technical Approach
The proofs build on and extend methods from prior work on spectral limits for random graphs and uniform hypergraphs:
- Chatterjee's Lindeberg replacement scheme is generalized to accommodate the multivariate and dependent nature of the random adjacency matrices.
- The variance calculations for the semicircle law rely on precise asymptotic tracking of mixed covariances between different edge types.
- The limiting distribution is established via convergence of Stieltjes transforms rather than the moment method, which is inapplicable here because spectral outliers create a variance deficit.
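As a reference point for the Stieltjes-transform step, the transform of the semicircle law with variance $\sigma^2$ has a closed form, being the root of $\sigma^2 m^2 + z m + 1 = 0$ that maps the upper half-plane to itself. The sketch below evaluates it with the standard library; the function name and the sign convention $m(z) = \int \rho(x)/(x - z)\,dx$ are choices made for this illustration.

```python
import cmath


def semicircle_stieltjes(z, sigma2=1.0):
    """Stieltjes transform of the semicircle law with variance sigma2.

    Convention: m(z) = integral of rho(x) / (x - z) dx, for Im z != 0.
    m(z) solves sigma2 * m**2 + z * m + 1 = 0.
    """
    root = cmath.sqrt(z * z - 4.0 * sigma2)
    m = (-z + root) / (2.0 * sigma2)
    # Keep the root whose imaginary part has the same sign as Im z.
    if (m.imag > 0) != (z.imag > 0):
        m = (-z - root) / (2.0 * sigma2)
    return m


if __name__ == "__main__":
    z = 0.3 + 0.5j
    m = semicircle_stieltjes(z, sigma2=0.8)
    # Residual of the self-consistent equation; should be near zero.
    print(m, abs(0.8 * m * m + z * m + 1.0))
```

Comparing this closed form with the empirical transform $\frac{1}{n}\sum_j (\lambda_j - z)^{-1}$ of a sampled, normalized matrix at a few points $z$ is a common way to check a semicircle prediction numerically.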
Theoretical and Practical Implications
These results enrich the spectral theory of random hypergraphs by systematically handling non-uniformity in both edge sizes and inclusion probabilities. The explicit variance formula enables:
- Analytic predictions of spectral bulk for a large class of heterogeneous network models.
- Insights into regimes where classical semicircle laws fail or where spectral outliers may dominate, with consequences for community detection, percolation thresholds, or optimization on hypergraphs.
On a practical front, these findings can inform the design of hypergraph models in applied fields (e.g., neuroscience, chemistry, social networks) where higher-order and heterogeneous interactions are inherent.
Future Directions and Open Problems
While the bulk spectral behavior in the dense, non-sparse regime is characterized, several directions remain open:
- Sparse regime analysis: Identifying LSDs when all or most edge classes are sparse demands the machinery of local weak convergence, as in recent advances for graphs and uniform hypergraphs.
- Spectral outliers: Quantitative analysis of the so-called 'outlier' eigenvalues and their impact on network processes remains largely unaddressed.
- Extensions beyond co-occurrence matrices: Exploration of alternative matrix models (e.g., Laplacians, higher-order tensors) in the non-uniform context.
Implications for data science and AI, such as improved models for high-order relational data or more accurate synthetic network generation, are anticipated as the spectral toolbox for non-uniform hypergraphs matures.
Conclusion
This work provides a rigorous, detailed characterization of the limiting spectral distribution for co-occurrence adjacency matrices of a wide class of non-uniform, inhomogeneous Erdős–Rényi hypergraphs. By connecting the LSD to explicit model parameters and establishing precise conditions for Gaussianization and the semicircle limit, it advances both the mathematical foundations and the applicability of random hypergraph theory in complex network modeling (2604.01877).