High-Degree Preserving Graph Pruning

Updated 16 January 2026

High-degree preserving graph pruning is a method that retains critical hubs by explicitly preserving node degree distributions, unlike naïve thresholding.
Techniques such as the Marginal Likelihood Filter (MLF) and Global Likelihood Filter (GLF) assess edge significance to maintain overall network structure.
Adaptive pruning in privacy-preserving FHE GNNs reduces computational overhead while keeping vital network features intact and accuracy loss small.

High-degree preserving graph pruning is a family of principled methods for reducing complex or noisy graphs to more informative subgraphs while explicitly preserving the (weighted or unweighted) degree distribution of the underlying nodes. It contrasts with naïve thresholding, which tends to fragment the graph and disproportionately penalizes low-degree nodes, and it applies both to empirical integer-weighted networks and to privacy-preserving GNNs operating in encrypted domains. The central principle is to retain and prioritize nodes and edges according to significance scores or degree statistics, so that structural hubs keep their connectivity and large-scale network features remain intact. Prominent realizations include the Marginal Likelihood Filter (MLF) and Global Likelihood Filter (GLF) for empirical networks, as well as encrypted degree-based pruning in privacy-preserving graph inference (Dianati, 2015; Zhao et al., 8 Jul 2025).

1. Null Models and Degree Statistics

High-degree preserving pruning begins with rigorous modeling of node degrees (or strengths). For an undirected graph $G=(V,E)$ with integer edge weights $w_{ij}$, the strength of node $i$ is $k_i = \sum_{j\neq i} w_{ij}$ and the total weight is $T=\frac{1}{2}\sum_{i}k_i=\sum_{i<j}w_{ij}$ (Dianati, 2015). In encrypted graph settings, the encrypted adjacency $\tilde{\mathbf{A}}\in\mathrm{CKKS}^{n\times n}$ yields encrypted row sums $\tilde{s}_v = \bigoplus_{u\in\mathcal V} \tilde{A}_{vu}$, or in vector form $\tilde{\mathbf{s}} = \tilde{\mathbf{A}} \otimes \tilde{\mathbf{1}}$, which serve as node importance scores (Zhao et al., 8 Jul 2025).
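As a concrete illustration of the plaintext definitions above, the following minimal Python sketch computes the strengths $k_i$ and the total weight $T$ from an edge list. The function name `strengths_and_total` and the five-node toy graph are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def strengths_and_total(edges):
    """Return (k, T) for an undirected graph with integer edge weights.

    edges: dict mapping an unordered pair (i, j), i != j, to its weight w_ij.
    """
    k = defaultdict(int)
    for (i, j), w in edges.items():
        k[i] += w  # each edge contributes its weight to both endpoints
        k[j] += w
    T = sum(edges.values())  # equals half the sum of all strengths
    return dict(k), T

# Hypothetical toy graph: a heavy triangle (0, 1, 2) plus two weak edges.
edges = {(0, 1): 10, (0, 2): 10, (1, 2): 10, (0, 3): 1, (3, 4): 2}
k, T = strengths_and_total(edges)
```

On this toy graph the identity $T=\frac{1}{2}\sum_i k_i$ can be checked directly by comparing `sum(k.values())` with `2 * T`.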

Both MLF and GLF employ a null model in which $T$ unit-edges are assigned to node pairs at random, so that each node's strength, and hence the degree distribution, is preserved in expectation. In this configuration-style null model, the probability that exactly $m$ unit-edges fall between $i$ and $j$ is binomial, $\Pr[\sigma_{ij}=m] = {T \choose m}p_{ij}^m (1-p_{ij})^{T-m}$, with $p_{ij} = k_i k_j/(2T^2)$.
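The claim that strengths are preserved in expectation can be checked with a short derivation. Writing $p_{ij} = k_i k_j/(2T^2)$ for the per-unit-edge probability and using $\sum_j k_j = 2T$, the binomial means sum to

$\mathbb{E}[\sigma_{ij}] = T p_{ij} = \frac{k_i k_j}{2T}, \qquad \sum_{j\neq i}\mathbb{E}[\sigma_{ij}] = \frac{k_i (2T - k_i)}{2T}.$

The remaining step relies on an assumption not stated in the text above: that a unit-edge may also land on node $i$ twice (a self-loop), with probability $(k_i/2T)^2$ and a contribution of $2$ to the strength. Under that assumption the identity is exact,

$\sum_{j\neq i}\mathbb{E}[\sigma_{ij}] + 2T\left(\frac{k_i}{2T}\right)^2 = \frac{k_i(2T-k_i)}{2T} + \frac{k_i^2}{2T} = k_i,$

whereas if self-loops are excluded the expected strength is $k_i(1 - k_i/2T)$, which is close to $k_i$ whenever no single node carries a large share of the total weight.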

2. Marginal Likelihood Filter (MLF): Edge Significance and Pruning

MLF evaluates the statistical significance of individual edges via their deviation from the null model distribution (Dianati, 2015). For edge $(i,j)$, the $p$-value is

$s_{ij}(w_{ij}) = \sum_{m\ge w_{ij}} \Pr[\sigma_{ij}=m|k_i,k_j,T] = 1 - \mathrm{BinCDF}(w_{ij}-1; T, p_{ij}),$

where $p_{ij} = k_i k_j/(2T^2)$. Edges are sorted by ascending $p$-value $s_{ij}$ (most significant first) and retained if $s_{ij}\leq \alpha$ for a chosen threshold $\alpha$ (e.g., $\alpha=0.01$), or by keeping a top fraction $p_{\mathrm{keep}}$ of the ranked edges.

This filter preserves the degree sequence in expectation and avoids isolation of low-degree nodes. Empirical network analyses demonstrate that MLF yields sparser yet globally connected subgraphs, recovers regionally faithful layouts (as in air traffic networks), and disentangles overlapping clusters better than strict weight thresholding (Dianati, 2015).
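The MLF procedure and its contrast with weight thresholding can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names and the five-node toy graph are hypothetical, the binomial tail is summed directly (adequate only for small $T$), and the comparison keeps the same number of edges under both rules.

```python
from math import comb

def binom_sf(w, T, p):
    """P[X >= w] for X ~ Binomial(T, p), by direct summation (small T only)."""
    return sum(comb(T, m) * p**m * (1 - p)**(T - m) for m in range(w, T + 1))

def mlf_pvalues(edges):
    """Map each edge (i, j) to its p-value s_ij under the null model."""
    k = {}
    for (i, j), w in edges.items():
        k[i] = k.get(i, 0) + w
        k[j] = k.get(j, 0) + w
    T = sum(edges.values())
    return {(i, j): binom_sf(w, T, k[i] * k[j] / (2 * T * T))
            for (i, j), w in edges.items()}

def mlf_filter(edges, alpha):
    """Keep edges whose p-value is at most alpha."""
    s = mlf_pvalues(edges)
    return {e: w for e, w in edges.items() if s[e] <= alpha}

def keep_most_significant(edges, n_keep):
    """Keep the n_keep edges with the smallest p-values."""
    s = mlf_pvalues(edges)
    ranked = sorted(edges, key=s.get)  # ascending p-value, most significant first
    return {e: edges[e] for e in ranked[:n_keep]}

def weight_threshold(edges, w_min):
    """Naive baseline: keep edges with weight >= w_min."""
    return {e: w for e, w in edges.items() if w >= w_min}

def covered_nodes(edges):
    """Nodes that still have at least one incident edge."""
    return {v for e in edges for v in e}

# Hypothetical toy graph: a heavy triangle (0, 1, 2) plus two weak edges.
edges = {(0, 1): 10, (0, 2): 10, (1, 2): 10, (0, 3): 1, (3, 4): 2}
pvals = mlf_pvalues(edges)
by_weight = weight_threshold(edges, 3)
by_mlf = keep_most_significant(edges, len(by_weight))
```

In this toy graph the weak edge $(3,4)$ has the smallest $p$-value because its endpoints have very low strength, so a weight of 2 is far above its null expectation. Thresholding at weight 3 keeps only the heavy triangle and isolates nodes 3 and 4, while the three most significant edges still touch all five nodes.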

3. Global Likelihood Filter (GLF) and Correlated Pruning

GLF generalizes significance filtering to the entire subgraph, formulating the likelihood of observing the full weighted graph $G' = \{\sigma_{ij}\}$ under the null ensemble:

$P(G') = \frac{1}{Z} \frac{ (\sum_{i<j}\sigma_{ij})! }{ \prod_{i<j} \sigma_{ij}! } \exp\left[ -\sum_{i<j} (\theta_i+\theta_j)\sigma_{ij} \right].$

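Here, the Lagrange multipliers $\theta_i$ are set to match the empirical strengths $k_i$, and the global pruning criterion seeks the $m'$-edge subgraph that is least likely under the null ensemble, i.e. that minimizes $P(G')$ or, equivalently, maximizes $-\log P(G')$; this mirrors MLF, which retains the edges with the smallest $p$-values. Monte Carlo search (e.g., Metropolis–Hastings) is used to traverse the combinatorial space efficiently while accounting for edge correlations (Dianati, 2015).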
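The likelihood above can be evaluated directly for any candidate subgraph. The Python sketch below computes $-\log P(G')$ up to the additive constant $\log Z$ and, as a stand-in for the Monte Carlo search, simply enumerates every $m'$-edge subgraph of a five-edge toy graph. Several things here are assumptions rather than details from the source: retained edges keep their original weights, the multipliers are set to $\theta_i = -\log\big(k_i/(\sqrt{2}\,T)\big)$ so that $e^{-(\theta_i+\theta_j)} = k_i k_j/(2T^2)$, and all function names are hypothetical.

```python
import math
from itertools import combinations

def neg_log_p(kept, theta):
    """-log P(G') up to the constant log Z, for kept = {(i, j): sigma_ij}."""
    total = sum(kept.values())
    value = -math.lgamma(total + 1)          # -log (sum of sigma_ij)!
    for (i, j), s in kept.items():
        value += math.lgamma(s + 1)          # +log sigma_ij!
        value += (theta[i] + theta[j]) * s   # +(theta_i + theta_j) * sigma_ij
    return value

def rank_subgraphs(edges, theta, m_keep):
    """Score every m_keep-edge subgraph, sorted by ascending -log P.

    Exhaustive enumeration is feasible only for tiny graphs; a Monte Carlo
    search would replace this loop on realistic networks.
    """
    scored = []
    for subset in combinations(sorted(edges), m_keep):
        kept = {e: edges[e] for e in subset}
        scored.append((neg_log_p(kept, theta), subset))
    return sorted(scored)

# Hypothetical toy graph with its strengths k_i and total weight T.
edges = {(0, 1): 10, (0, 2): 10, (1, 2): 10, (0, 3): 1, (3, 4): 2}
k = {0: 21, 1: 20, 2: 20, 3: 3, 4: 2}
T = 33
# Assumed multipliers: exp(-(theta_i + theta_j)) = k_i * k_j / (2 * T**2).
theta = {i: -math.log(ki / (math.sqrt(2) * T)) for i, ki in k.items()}
ranking = rank_subgraphs(edges, theta, m_keep=3)
```

Each entry of `ranking` pairs a score with the edge subset that produced it, so both ends of the list can be inspected to see which subgraphs the null model finds most and least probable.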

In practical evaluations, the Jaccard similarity between MLF and GLF retained-edge sets exceeds 80% even under severe pruning, indicating that MLF approximates the globally optimal selection for many networks.

4. Encrypted High-Degree Preserving Pruning in FHE GNNs

Within privacy-preserving GNN inference under fully homomorphic encryption (FHE), high-degree preserving pruning is engineered to minimize redundancy and computational overhead while maintaining accuracy (Zhao et al., 8 Jul 2025). The pipeline entails:

Calculation of encrypted degree statistics $\tilde{s}_v = \mathrm{Enc}(\deg(v))$ using CKKS primitives.
Partitioning nodes into $m+1$ groups using descending thresholds $\tau_1>\cdots>\tau_m$ via approximate comparison (HE.AprxCmp), generating multi-level importance masks $\tilde{M}_{i,v}$.
Logical pruning: the keep mask $\tilde{K} = \tilde{1} \ominus \tilde{M}_0$ zeroes out low-importance nodes in both the features and the adjacency.
Adaptive activation: Assignment of polynomial degree approximations $P_{d_i}(z)$ in nonlinearities, with higher-degree polynomials reserved for nodes with high preserved degree.

Homomorphic implementations use only additions and a small number of multiplications and rotations per node, which keeps the pipeline efficient within FHE constraints.
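The four pipeline steps above can be mirrored in plaintext to make the data flow concrete. The Python sketch below is only a stand-in: it uses ordinary integers instead of CKKS ciphertexts, exact comparison instead of approximate comparison, and hypothetical function names, thresholds, and per-level polynomial degrees. It also assumes that level 0 is the least important group, consistent with $\tilde{M}_0$ being the mask that is pruned.

```python
def degree_scores(adj):
    """Row sums of the adjacency matrix (plaintext stand-in for encrypted s_v)."""
    return [sum(row) for row in adj]

def level_masks(scores, thresholds):
    """One 0/1 mask per importance level, for descending thresholds.

    A node's level is the number of thresholds its score reaches, so level 0
    collects the nodes below every threshold.
    """
    masks = [[0] * len(scores) for _ in range(len(thresholds) + 1)]
    for v, s in enumerate(scores):
        level = sum(1 for tau in thresholds if s >= tau)
        masks[level][v] = 1
    return masks

def prune(adj, features, masks):
    """Zero out level-0 nodes in adjacency and features via keep = 1 - M_0."""
    n = len(adj)
    keep = [1 - bit for bit in masks[0]]
    pruned_adj = [[keep[u] * keep[v] * adj[u][v] for v in range(n)] for u in range(n)]
    pruned_feat = [[keep[v] * x for x in features[v]] for v in range(n)]
    return keep, pruned_adj, pruned_feat

def activation_degrees(masks, degree_per_level):
    """Per-node polynomial degree, selected by a mask-weighted sum over levels."""
    n = len(masks[0])
    return [sum(degree_per_level[i] * masks[i][v] for i in range(len(masks)))
            for v in range(n)]

# Hypothetical 5-node unweighted graph: edges 0-1, 0-2, 0-3, 1-2, 3-4.
adj = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
features = [[1.0], [2.0], [3.0], [4.0], [5.0]]

scores = degree_scores(adj)
masks = level_masks(scores, thresholds=[3, 2])   # illustrative tau_1 > tau_2
keep, pruned_adj, pruned_feat = prune(adj, features, masks)
poly = activation_degrees(masks, degree_per_level=[0, 2, 4])  # illustrative degrees
```

Expressing selection as mask multiplications and sums, rather than branches, reflects the constraint that an encrypted pipeline cannot branch on ciphertext values.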

5. Empirical Validation and Trade-offs

Extensive benchmarking of MLF and GLF on the US airport network and occupation co-occurrence datasets highlights key structural trade-offs (Dianati, 2015):

When the fraction $r$ of retained edges is near $0.5$, MLF keeps $>90\%$ of nodes in the largest component, whereas weight thresholding fragments the graph.
MLF subgraphs unfold coherent geographic or semantic structures, while thresholded graphs remain dense and tangled.
Clique numbers and clustering coefficients decrease with pruning, but connectivity is preserved for hubs.

In DESIGN (Zhao et al., 8 Jul 2025), encrypted GNNs benefit from similar trade-offs:

Pruning up to $40$–$50\%$ of nodes reduces inference latency by over $20\%$ at a cost of only $1$–$2\%$ accuracy loss (Cora dataset).
Adaptive polynomial selection yields an additional $20$–$30\%$ savings in homomorphic multiplications.
The full pipeline (pruning + adaptive activation) yields a $2.0$–$2.4\times$ speedup over SEAL, with accuracy competitive with optimized FHE GNNs.

Representative empirical results (Cora, DESIGN):

Prune (%)	Accuracy (%)	Latency (s)
10	76.7	93.5
40	74.2	70.0
70	65.7	48.9
90	47.1	37.4
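
To read the trade-off off the table, the short Python sketch below expresses each row relative to the 10% pruning row. The choice of that row as the reference is an assumption made here, since the table gives no unpruned baseline; the numbers themselves are copied from the table.

```python
# (prune %, accuracy %, latency s) rows copied from the table above.
rows = [(10, 76.7, 93.5), (40, 74.2, 70.0), (70, 65.7, 48.9), (90, 47.1, 37.4)]

# Reference row: 10% pruning (assumed here; no unpruned baseline is listed).
_, base_acc, base_lat = rows[0]

# For each remaining row: accuracy drop in points, latency reduction in percent.
relative = [
    (prune, round(base_acc - acc, 1), round(100 * (1 - lat / base_lat), 1))
    for prune, acc, lat in rows[1:]
]
```

Going from 10% to 40% pruning costs about 2.5 accuracy points for a latency reduction of about 25%, while the 90% row gives up almost 30 points for a 60% reduction, which shows how quickly the accuracy cost grows beyond moderate pruning levels.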

A plausible implication is that high-degree preserving approaches maintain the integrity of global network features while enabling significant computational reductions, in both plaintext and encrypted settings.

6. Algorithmic Complexities and Practical Guidelines

MLF has time complexity $O(|E|\log |E|)$, dominated by edge sorting; individual binomial tests are $O(1)$ or can be approximated for large $T$. GLF's cost grows as $O(N_{\mathrm{iter}})$ in the number of Monte Carlo swaps, and each swap is cheap when factorials are pre-tabulated.

DESIGN’s FHE pipeline leverages only CKKS-compatible primitives (HE.Add, HE.Mult, HE.Rotate), suitable for real-time inference in privacy-sensitive domains (Zhao et al., 8 Jul 2025).

Guidelines include tuning thresholds so that the giant component retains at least $80$–$90\%$ of nodes, and using clustering measures to select appropriate pruning points. For encrypted graphs, threshold selection and the configuration of the adaptive activation scheme directly affect inference efficiency and accuracy.
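The giant-component guideline can be checked mechanically. The Python sketch below, a minimal illustration with hypothetical function names and a hypothetical ranked edge list, measures the largest connected component with a union-find structure and finds the smallest number of top-ranked edges that keeps that component above a target fraction.

```python
def giant_component_fraction(nodes, edges):
    """Fraction of `nodes` (non-empty) in the largest connected component."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for i, j in edges:
        parent[find(i)] = find(j)  # merge the two components

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / len(nodes)

def smallest_safe_cut(nodes, ranked_edges, min_fraction=0.8):
    """Fewest top-ranked edges whose giant component reaches min_fraction.

    Returns None if even the full edge list falls short.
    """
    for n_keep in range(1, len(ranked_edges) + 1):
        if giant_component_fraction(nodes, ranked_edges[:n_keep]) >= min_fraction:
            return n_keep
    return None

# Hypothetical edges, already sorted from most to least significant.
nodes = [0, 1, 2, 3, 4]
ranked_edges = [(3, 4), (1, 2), (0, 1), (0, 2), (0, 3)]
n_keep = smallest_safe_cut(nodes, ranked_edges, min_fraction=0.8)
```

In this toy ranking the component containing nodes 0, 1 and 2 covers only 60% of the graph until the last edge $(0,3)$ is added, so an 80% target forces all five edges to be kept.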

7. Comparison to Thresholding and Degree-Sequence Effects

A fundamental distinction is that both MLF and GLF are built on a null model that preserves the degree sequence in expectation ($\mathbb{E}[\sum_j \sigma_{ij}] = k_i$), so hubs remain central and there is no systematic bias against low-degree nodes (Dianati, 2015). Naïve weight thresholding, by contrast, discards every edge below the cutoff regardless of the strengths of its endpoints, disconnecting low-strength nodes and often producing fragmented or misleading topologies.

In encrypted GNNs, partitioning by encrypted degree ensures that important nodes are not removed by a uniform pruning strategy, which keeps the pruned graph faithful to the underlying graph semantics (Zhao et al., 8 Jul 2025). This suggests that degree-preserving pruning is preferable for most real-world analytic and predictive tasks where global connectivity and hub dynamics are essential.


References

Dianati (2015). Unwinding the hairball graph: pruning algorithms for weighted complex networks.

Zhao et al. (2025). DESIGN: Encrypted GNN Inference via Server-Side Input Graph Pruning.
