
Directed Hypergraph Configuration Model (DHCM)

Updated 20 February 2026
  • Directed Hypergraph Configuration Model is defined as a null model that preserves local marginals including node in/out degrees and hyperedge size distributions in directed hypergraphs.
  • It employs mathematical encodings, asymptotic enumeration, and efficient uniform sampling algorithms to analyze higher-order, non-pairwise network interactions.
  • The model extends classical configuration models and underpins rigorous hypothesis testing in diverse fields such as political networks, contagion dynamics, and economic complexity.

The Directed Hypergraph Configuration Model (DHCM) specifies a null model for randomization of directed hypergraphs under strict preservation of local marginals: node in- and out-degree sequences, and hyperedge head and tail size distributions. It generalizes the classical configuration model for graphs to the higher-order, directed, and potentially non-uniform setting, enabling statistically grounded hypothesis testing and structural analysis in complex systems with non-pairwise interactions.

1. Formal Definition and Core Properties

A directed hypergraph $H=(V,E)$ comprises a finite node set $V$ and a multiset $E$ of directed hyperedges. Each hyperedge $e\in E$ is an ordered pair $e=(h_e,t_e)$, where $h_e\subseteq V$ is the head and $t_e\subseteq V$ is the tail; depending on the formulation, $h_e\cap t_e$ may be non-empty or the two sets may be required to be disjoint (Preti et al., 2024, Greenhill et al., 2024). For every node $v\in V$, the out-degree $d_{\mathrm{out}}(v)$ is the number of heads containing $v$, and the in-degree $d_{\mathrm{in}}(v)$ is the number of tails containing $v$. Each hyperedge $e$ is characterized by its head size $s_{\mathrm{head}}(e)=|h_e|$ and tail size $s_{\mathrm{tail}}(e)=|t_e|$.

The DHCM ensemble $\widehat{\mathcal{H}}^{DHCM}$ is the set of all hypergraphs on $V$ whose node in- and out-degree sequences and hyperedge head- and tail-size sequences exactly match those of a given observed hypergraph $H^\circ$. Letting $E^\circ$ denote the edge multiset in the observed structure,

$$\widehat{\mathcal{H}}^{DHCM} = \left\{ H=(V,E) \;\middle|\; \begin{array}{l} \forall v\in V: d_{\mathrm{out}}^{(H)}(v)=d_{\mathrm{out}}^{(H^\circ)}(v), \; d_{\mathrm{in}}^{(H)}(v)=d_{\mathrm{in}}^{(H^\circ)}(v) \\ \forall e\in E: s_{\mathrm{head}}^{(H)}(e)=s_{\mathrm{head}}^{(H^\circ)}(e),\; s_{\mathrm{tail}}^{(H)}(e)=s_{\mathrm{tail}}^{(H^\circ)}(e) \end{array} \right\}$$

The model assigns uniform probability to each $H\in\widehat{\mathcal{H}}^{DHCM}$; that is, $P_{DHCM}(H) = 1/|\widehat{\mathcal{H}}^{DHCM}|$ for $H\in \widehat{\mathcal{H}}^{DHCM}$, and zero otherwise (Preti et al., 2024).
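The membership condition above can be illustrated with a minimal Python sketch. It assumes a hypergraph is stored as a list of `(head, tail)` pairs of sets and that hyperedges are matched by their position in the list; the function names `marginals` and `same_marginals` are hypothetical and not taken from the cited papers.

```python
from collections import Counter

def marginals(hyperedges):
    """Return (out-degrees, in-degrees, per-hyperedge (head size, tail size))."""
    out_deg, in_deg = Counter(), Counter()
    for head, tail in hyperedges:
        out_deg.update(head)  # out-degree: number of heads containing v
        in_deg.update(tail)   # in-degree: number of tails containing v
    # Assumption: hyperedges are identified by list position.
    sizes = [(len(head), len(tail)) for head, tail in hyperedges]
    return out_deg, in_deg, sizes

def same_marginals(candidate, observed):
    """True if candidate matches every DHCM-preserved marginal of observed."""
    return marginals(candidate) == marginals(observed)

observed = [({"a", "b"}, {"c"}), ({"c"}, {"a"})]
rewired = [({"b", "c"}, {"a"}), ({"a"}, {"c"})]
print(same_marginals(rewired, observed))
```

Here `rewired` differs from `observed` as a hypergraph but shares all node degrees and hyperedge sizes, so under these assumptions both belong to the same ensemble.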

2. Mathematical Encodings

Directed hypergraphs in the DHCM may be represented in several mathematically equivalent ways:

  • Incidence matrices: $A^+\in\{0,1\}^{|V|\times|E|}$ with $A^+_{v,e}=1$ iff $v\in h_e$; $A^-\in\{0,1\}^{|V|\times|E|}$ with $A^-_{v,e}=1$ iff $v\in t_e$.
  • Adjacency tensor: $X\in\{0,1\}^{|V|\times|E|\times\{+1,-1\}}$, where $X_{v,e,d}$ indicates membership in the head ($d=+1$) or the tail ($d=-1$).
  • Bipartite representation: a bipartite, potentially directed, graph $G=(L,R,D)$ with $L=V$, $R=E$, and arcs $D$ labeled $+1$ (head) or $-1$ (tail). Degree preservation reduces to fixed-degree constraints on both sides of the bipartite graph (Preti et al., 2024, Greenhill et al., 2024).
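As a rough illustration of how these encodings relate, the following sketch builds the two incidence matrices and the labeled arc set from a list of `(head, tail)` pairs. The function name `encode` and the list-of-lists matrix layout are assumptions made for this example.

```python
def encode(nodes, hyperedges):
    """Build A+ and A- (as nested lists) and the labeled arc set D."""
    index = {v: i for i, v in enumerate(nodes)}
    m = len(hyperedges)
    a_plus = [[0] * m for _ in nodes]   # a_plus[v][e] = 1 iff v in head of e
    a_minus = [[0] * m for _ in nodes]  # a_minus[v][e] = 1 iff v in tail of e
    arcs = set()                        # triples (node, hyperedge index, label)
    for j, (head, tail) in enumerate(hyperedges):
        for v in head:
            a_plus[index[v]][j] = 1
            arcs.add((v, j, +1))
        for v in tail:
            a_minus[index[v]][j] = 1
            arcs.add((v, j, -1))
    return a_plus, a_minus, arcs

nodes = ["a", "b", "c"]
hyperedges = [({"a", "b"}, {"c"}), ({"c"}, {"a"})]
a_plus, a_minus, arcs = encode(nodes, hyperedges)

out_degrees = [sum(row) for row in a_plus]        # row sums of A+
head_sizes = [sum(col) for col in zip(*a_plus)]   # column sums of A+
```

Row sums of the two matrices give the node out- and in-degrees, and column sums give the head and tail sizes, which is why fixing the marginals amounts to fixing degrees on both sides of the bipartite graph.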

3. Asymptotic Enumeration

Let $N(d^+,d^-,H,T)$ denote the number of (unlabeled) directed hypergraphs with prescribed node out-degree sequence $d^+$, in-degree sequence $d^-$, head-size multiset $H$, and tail-size multiset $T$. Under the conditions

  • the sum of degrees matches the sum of hyperedge sizes ($M^+=\sum_i d_i^+=\sum_j t_j$, $M^-=\sum_i d_i^-=\sum_j h_j$),
  • the maxima $d^+_{\max}, d^-_{\max}, t_{\max}, h_{\max}$ are all $o(M^{1/2})$,
  • at least one of $d^+,d^-,T,H$ is "near-regular",

the cardinality is asymptotically

$$N(d^+,d^-,H,T) = R(d^+,d^-,H,T)\cdot\exp[O(\epsilon)]$$

with leading term

$$R(d^+,d^-,H,T) = \frac{M^+!\, M^-!}{\prod_{i=1}^n d_i^+! \prod_{i=1}^n d_i^-! \prod_{j=1}^m t_j!\, h_j!} \exp\left\{ \frac{M^+ K^+}{2(M^+)^2} + \frac{M^- K^-}{2(M^-)^2} - \frac{M_2^+ K_2^+}{2(M^+)^2} - \frac{M_2^- K_2^-}{2(M^-)^2} - \frac{\sum_{i=1}^n d_i^+ d_i^-}{M^+M^-} \right\}$$

where $K^+=\sum_j t_j$, $K^-=\sum_j h_j$, $M_2^+=\sum_i (d_i^+)^2$, $K_2^+=\sum_j (t_j)^2$, with $M_2^-$ and $K_2^-$ defined analogously from $d^-$ and $h$, and $\epsilon=O((d^+_{\max}+t_{\max})^4/M^+ + (d^-_{\max}+h_{\max})^4/M^-)$ (Greenhill et al., 2024). Dropping the near-regularity requirement is possible at the cost of larger error terms and stricter growth conditions.

Specializations include the standard configuration model when all hyperedges are single arcs ($a^+=a^-=1$ for all edges), yielding the classical directed graph formula.
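Before applying the asymptotic formula, it can help to check the stated preconditions on concrete input. The sketch below does this for the sum-matching condition and reports how the largest parameter compares with a square-root scale. It follows this section's pairing of $d^+$ with $t$ and $d^-$ with $h$. Because the text does not define $M$, the choice $M=\min(M^+,M^-)$ is an assumption of this example, as is the function name `enumeration_preconditions`. Near-regularity is not checked.

```python
import math

def enumeration_preconditions(d_plus, d_minus, t, h):
    """Check the sum condition and report max parameter relative to sqrt(M)."""
    m_plus, m_minus = sum(d_plus), sum(d_minus)
    sums_match = (m_plus == sum(t)) and (m_minus == sum(h))
    largest = max(max(d_plus), max(d_minus), max(t), max(h))
    # Assumption: M is taken as min(M+, M-); the text leaves M undefined.
    scale = math.sqrt(min(m_plus, m_minus))
    return {
        "M_plus": m_plus,
        "M_minus": m_minus,
        "sums_match": sums_match,
        "max_over_sqrt_M": largest / scale,
    }

report = enumeration_preconditions(
    d_plus=[1, 1, 1, 1], d_minus=[1, 1, 1, 1], t=[2, 2], h=[2, 2]
)
```

The condition $o(M^{1/2})$ is asymptotic, so the ratio is only a finite-size indication: values well below 1 suggest the sparse regime that the formula targets.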

4. Uniform Sampling Algorithms

Uniform sampling from the DHCM ensemble is made tractable by translating the problem to the bipartite graph domain. Two principal schemes have been detailed:

  • Markov chain Monte Carlo: The NuDHy-Degs algorithm [Editor's term] is an ergodic Markov chain on the space of directed bipartite graphs with fixed in/out-degrees. Its central move, the Parity Swap Operation (PSO), interchanges two $+1$-arcs (or two $-1$-arcs) between distinct node-edge pairs, preserving all degrees. The chain is reversible, aperiodic, and irreducible, and every proposed swap is accepted with probability 1, which guarantees convergence to the uniform distribution.
  • Rejection/importance sampling: For sparse and bounded-degree regimes, bipartite graphs $G^+$ and $G^-$ with the required degree sequences are sampled independently, and the pair is accepted if their edge sets are disjoint. Under suitable bounds, the acceptance probability approaches 1, enabling constant-time uniform sampling (Greenhill et al., 2024).
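The rejection scheme can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Greenhill et al.: each bipartite graph is drawn by stub matching and resampled until it has no repeated (node, hyperedge) pair, which gives a uniform simple bipartite graph with the required degrees. The function names and the `max_tries` cap are hypothetical.

```python
import random

def sample_bipartite(node_deg, edge_size, rng, max_tries=10_000):
    """Uniform simple bipartite graph via stub matching with rejection."""
    node_stubs = [v for v, d in node_deg.items() for _ in range(d)]
    edge_stubs = [e for e, s in edge_size.items() for _ in range(s)]
    if len(node_stubs) != len(edge_stubs):
        raise ValueError("degree sum and size sum differ")
    for _ in range(max_tries):
        rng.shuffle(edge_stubs)
        pairs = set(zip(node_stubs, edge_stubs))
        if len(pairs) == len(node_stubs):  # no (node, edge) pair repeated
            return pairs
    raise RuntimeError("no simple bipartite graph found")

def sample_dhcm_rejection(out_deg, head_size, in_deg, tail_size,
                          seed=None, max_tries=10_000):
    """Draw G+ and G- independently; accept if their edge sets are disjoint."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        g_plus = sample_bipartite(out_deg, head_size, rng)   # head memberships
        g_minus = sample_bipartite(in_deg, tail_size, rng)   # tail memberships
        if g_plus.isdisjoint(g_minus):  # no node in both head and tail of an edge
            return g_plus, g_minus
    raise RuntimeError("no disjoint pair found")

g_plus, g_minus = sample_dhcm_rejection(
    out_deg={"a": 1, "b": 1, "c": 1}, head_size={0: 2, 1: 1},
    in_deg={"a": 1, "c": 1}, tail_size={0: 1, 1: 1}, seed=0,
)
```

Every accepted pair is reached by the same number of stub matchings, so the output is uniform over valid pairs. The cost lies in the rejections, which become rare only in the sparse regime described above.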

Pseudocode for NuDHy-Degs is as follows:

Input: initial G ∈ S, number of steps s
For t = 1 … s:
    draw d = +1 with probability |D⁺|/|D|, otherwise d = −1
    if d = +1:
        pick u ≠ v ∈ L; compute Δ⁺(u,v)
        if Δ⁺(u,v) is not empty:
            pick (α,β) ∈ Δ⁺(u,v) at random
            swap (u,α,+1),(v,β,+1) ↔ (u,β,+1),(v,α,+1)
        else: do nothing
    else:
        analogous process on R and −1 arcs
Return G
Each step runs in $O(1)$ time if adjacency lists are stored as hash sets, and the typical mixing time is $O(|D|\log|D|)$ (Preti et al., 2024).
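A runnable Python sketch of the swap chain described by the pseudocode is given below. Two points are assumptions of this sketch, not statements from the source. First, $\Delta^+(u,v)$ is taken to be all pairs $(\alpha,\beta)$ with $\alpha$ adjacent only to $u$ and $\beta$ adjacent only to $v$, so a swap never creates a duplicate arc. Second, the head/tail disjointness of a hyperedge is not enforced. The function name and the dictionary layout are hypothetical, and for reproducibility the sketch sorts candidate sets, so its steps are not $O(1)$.

```python
import random

def nudhy_degs_sketch(heads_of_node, tails_of_edge, steps, seed=None):
    """Degree-preserving swap chain on +1 arcs (by node) and -1 arcs (by edge)."""
    rng = random.Random(seed)
    heads = {v: set(es) for v, es in heads_of_node.items()}  # +1 arcs, keyed on L
    tails = {e: set(vs) for e, vs in tails_of_edge.items()}  # -1 arcs, keyed on R
    head_keys, tail_keys = sorted(heads), sorted(tails)
    n_plus = sum(len(s) for s in heads.values())
    n_minus = sum(len(s) for s in tails.values())
    if n_plus + n_minus == 0:
        return heads, tails
    p_plus = n_plus / (n_plus + n_minus)  # coin bias |D+| / |D|
    for _ in range(steps):
        if rng.random() < p_plus:
            side, keys = heads, head_keys
        else:
            side, keys = tails, tail_keys
        if len(keys) < 2:
            continue
        u, v = rng.sample(keys, 2)
        only_u = sorted(side[u] - side[v])  # neighbours of u but not of v
        only_v = sorted(side[v] - side[u])  # neighbours of v but not of u
        if not only_u or not only_v:
            continue                        # swap set empty: do nothing
        a, b = rng.choice(only_u), rng.choice(only_v)
        side[u].remove(a); side[u].add(b)   # (u, a) becomes (u, b)
        side[v].remove(b); side[v].add(a)   # (v, b) becomes (v, a)
    return heads, tails

h0 = {"a": {0, 1}, "b": {1}, "c": {2}}      # node -> hyperedges with node in head
t0 = {0: {"b"}, 1: {"c"}, 2: {"a", "b"}}    # hyperedge -> nodes in its tail
heads, tails = nudhy_degs_sketch(h0, t0, steps=500, seed=7)
```

Each swap leaves the set sizes on both sides unchanged, so out-degrees, in-degrees, head sizes, and tail sizes are all preserved. The number of available swaps for a chosen pair is also unchanged by the move, which makes the proposal symmetric.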

5. Model Scope, Extensions, and Limitations

The DHCM strictly preserves node degrees and hyperedge size marginals, but does not retain higher-order structure such as correlations between node degree and edge size, or specific motif frequencies. If such features are essential, the Directed Hypergraph Joint Model (DHJM) preserves the full five-dimensional joint tensor of node-degrees and hyperedge-sizes at the expense of a more complex and slower-mixing Markov chain on a restricted move set (Preti et al., 2024).

The current micro-canonical design does not account for statistical noise in observational data. For such purposes, a canonical-ensemble version, preserving marginals only in expectation via an exponential-random-hypergraph distribution, is suggested. Incorporation of additional constraints—such as reciprocity, motif counts, or annotated roles—would require novel modifications of the swap operations, as in the annotated-hypergraph configuration model (Preti et al., 2024).

6. Applications and Empirical Context

DHCM and its efficient sampling algorithms have been deployed in application areas where higher-order, directed interactions are central, including:

  • Quantification of political homophily oscillations in higher-order representations of US Congress data (Preti et al., 2024).
  • Analysis of non-linear contagion dynamics in contact hyper-networks, where deviations from mean-field predictions are only explained when joint degree distributions are considered (Preti et al., 2024).
  • Measurement of economic complexity in global trade networks, associating local marginals with established structural complexity indices (Preti et al., 2024).

The uniform null model generated by DHCM provides a rigorous reference for the assessment of structurally significant patterns in such contexts.
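To make the use of the null model as a reference concrete, the sketch below compares an observed statistic with its values on null samples, returning a z-score and a one-sided empirical p-value. The statistic itself and the sampler are left abstract. The function name `null_model_summary` and the add-one correction in the p-value are choices made for this example, not taken from the cited work.

```python
import statistics

def null_model_summary(observed, null_values):
    """Return (z-score, one-sided empirical p-value) against null samples."""
    mean = statistics.fmean(null_values)
    sd = statistics.stdev(null_values)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    # Share of null samples at least as extreme, with add-one correction
    # so that the estimate is never exactly zero.
    exceed = sum(1 for x in null_values if x >= observed)
    p = (exceed + 1) / (len(null_values) + 1)
    return z, p

z, p = null_model_summary(observed=5, null_values=[1, 2, 3, 4, 5])
```

In practice `null_values` would hold the statistic evaluated on many hypergraphs drawn from the DHCM ensemble, with enough samples for the p-value resolution to be meaningful.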

DHCM recovers, as special cases, established configuration models:

  • When all hyperedges are of arity $(a^+,a^-)$, the formula and uniform sampler reduce to those for uniform dihypergraphs with given degree sequences (Greenhill et al., 2024).
  • For $a^+=a^-=1$, one recovers the classic directed configuration model for graphs.
  • By discarding direction and treating $A^+,A^-$ symmetrically, one returns to the undirected hypergraph configuration model.

These connections affirm that the DHCM sits within a hierarchy of random network models, extending the configuration paradigm to general, multi-node, directed interactions.

References (2)
