
AS-DSG: Adaptive Sparsification for Directed Semantic Graphs

Updated 27 November 2025
  • The paper introduces adaptive sparsification frameworks that optimize directed semantic graphs through entropy-minimizing kNN pruning, GST, and spectral methods to reduce redundancy while preserving key properties.
  • The methodology leverages natural language inference scores and learned graph maskers to compute directional edge weights, ensuring scalable performance and robust connectivity.
  • Empirical results demonstrate significant edge reduction (down to 1.67% density) and up to 3.4× inference speedup, all while maintaining semantic consistency and spectral integrity.

An adaptively sparsified directed semantic graph (AS-DSG) is a structured representation of semantic relationships among entities—such as LLM outputs, knowledge graph elements, or labeled data points—where edge set sparsity and direction are optimized to preserve essential semantic, spectral, and topological information, while minimizing redundancy and computational cost. AS-DSG techniques span uncertainty quantification in LLMs, graph neural network acceleration, and scalable spectral graph algorithms, and formalize a key element in quantifying and structuring semantic information (Zhao et al., 20 Nov 2025, Zhang et al., 2 Feb 2024, Zhang et al., 2018).

1. Formal Definitions and Directed Semantic Graph Construction

An adaptively sparsified directed semantic graph is defined as $G_{\mathrm{dir}} = (V, E, W)$, where $V$ is a set of nodes (e.g., sampled responses, labeled graph vertices), $E \subseteq V \times V$ is a set of ordered edges capturing semantic directionality (e.g., entailment, causation), and $W: E \to \mathbb{R}_{\ge 0}$ assigns each edge a nonnegative weight. The semantics of $W$ and $E$ are domain-specific: in LLM uncertainty quantification, $W(i, j)$ captures the directional entailment strength between outputs $r^i \to r^j$ (Zhao et al., 20 Nov 2025); in labeled knowledge graphs, $W$ reflects frequency or predicate-dependent confidence (Zhang et al., 2018); in neural network-augmented graphs, $W$ is a learned function of node embeddings (Zhang et al., 2 Feb 2024).

Directed Laplacians for AS-DSGs are constructed as $L = D_{\mathrm{out}} - A^\top$, with $D_{\mathrm{out}}$ a diagonal out-degree matrix and $A$ the weighted (possibly predicate-mixed (Zhang et al., 2018)) adjacency. Sparsification proceeds from dense or complete initial constructions.
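
As a concrete illustration, the Laplacian construction above can be expressed in a few lines; the following is a minimal NumPy sketch (not code from the cited papers):

```python
import numpy as np

def directed_laplacian(A: np.ndarray) -> np.ndarray:
    """L = D_out - A^T, where A[i, j] is the weight of the directed edge i -> j."""
    d_out = A.sum(axis=1)          # out-degree (row sum) of each node
    return np.diag(d_out) - A.T    # diagonal out-degree matrix minus transposed adjacency

# Example: a 3-node graph with weighted edges 0->1, 1->2, 2->0
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.5],
              [0.8, 0.0, 0.0]])
L = directed_laplacian(A)
```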

2. Pairwise Semantic Measurement and Edge Weighting

In applications such as structural uncertainty quantification for LLMs, AS-DSG edge construction starts from pairwise semantics. For sampled outputs $\mathcal{R} = \{r^1, \ldots, r^N\}$ to query $x$, the algorithm measures, for each ordered pair $(i, j)$, the directional semantic entailment using a natural language inference (NLI) model to yield probability scores for entailment, neutrality, and contradiction: $P_{\mathrm{NLI}}(r^i \to r^j \mid x) = (p_e, p_n, p_c)$. Edge weights are then calculated as $A_{ij} = p_e + \tfrac{1}{2} p_n$, incorporating both direct entailment and partial neutrality, while contradiction contributes nothing (Zhao et al., 20 Nov 2025).
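
A hedged sketch of this weighting scheme is shown below; `nli_probs` is a hypothetical placeholder for whatever NLI model supplies the (entailment, neutral, contradiction) probabilities for an ordered pair of responses:

```python
import numpy as np

def build_directed_adjacency(responses, nli_probs):
    """Dense directed adjacency with A[i, j] = p_e + 0.5 * p_n for r^i -> r^j."""
    n = len(responses)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            p_e, p_n, p_c = nli_probs(responses[i], responses[j])  # placeholder NLI call
            A[i, j] = p_e + 0.5 * p_n   # contradiction probability contributes nothing
    return A
```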

In graph neural network contexts, semantic weights are derived from a parameterized graph masker $\Phi$, which operates on node embeddings $h_u, h_v$ via $m_{uv} = \Phi([h_u \,\|\, h_v]; \psi)$ and is optimized by supervised learning to yield dense semantic anchor masks (Zhang et al., 2 Feb 2024).
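
A minimal sketch of such a masker, assuming a small PyTorch MLP over concatenated node embeddings (the actual architecture and training objective in GST may differ), could look like:

```python
import torch
import torch.nn as nn

class GraphMasker(nn.Module):
    """Scores an edge (u, v) from concatenated node embeddings: m_uv = Phi([h_u || h_v]; psi)."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, h_u: torch.Tensor, h_v: torch.Tensor) -> torch.Tensor:
        # Soft mask value in (0, 1) per edge
        return torch.sigmoid(self.mlp(torch.cat([h_u, h_v], dim=-1))).squeeze(-1)

# Usage: score a batch of 5 candidate edges
masker = GraphMasker(dim=16)
h_u, h_v = torch.randn(5, 16), torch.randn(5, 16)
mask_values = masker(h_u, h_v)   # shape (5,)
```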

3. Adaptive Sparsification Principles and Algorithms

Adaptive sparsification in directed semantic graphs has two key frameworks:

  • Entropy-Minimizing kNN Pruning: Starting from a dense directed adjacency $A$, for each $k = 1, \ldots, N-1$, construct $G_k$ by retaining the top-$k$ outgoing edges per node. Apply the Adjusting Operator to enforce strong connectivity and renormalize rows so each forms a stochastic matrix. Compute the one-dimensional structural (random-walk) entropy $H^1(G_k) = -\sum_{v=1}^N \pi_k(v) \log_2 \pi_k(v)$, with $\pi_k$ the stationary distribution of the Markov chain induced by $G_k$. Select $k^\star = \arg\min_k H^1(G_k)$: the sparsest strongly connected graph minimizing uncertainty in community assignment (Zhao et al., 20 Nov 2025). A code sketch of this selection procedure follows the list.
  • Semantic/Topological Anchor Matching (GST): Graph Sparse Training (GST) for neural networks treats sparsification as a constrained optimization. It minimizes the edge count $\|M\|_0$ subject to preserving both semantic outputs ($\operatorname{KL}[f(A \odot M, X; \Theta), Z^A] \leq \varepsilon_s$) and spectral/topological anchors ($\sum_{k \in I} |\lambda_k(L^A) - \lambda_k(L(M))| \leq \varepsilon_t$). Iterative drop/regrow rewiring alternately prunes and regrows edges according to a combined criterion $\phi(e) = B_s \phi_{\mathrm{sema}}(e) + B_t \phi_{\mathrm{topo}}(e)$, where $\phi_{\mathrm{sema}}$ measures the impact on semantic consistency and $\phi_{\mathrm{topo}}$ approximates the resulting eigenvalue shift (Zhang et al., 2 Feb 2024).
  • Spectral Sparsification: For semantic knowledge graphs, theoretical frameworks guarantee the existence of linear-sized spectral sparsifiers that preserve Laplacian quadratic forms up to a $(1 \pm \epsilon)$ factor for all $x$. Edges are sampled or added adaptively based on “directed effective resistance” or spectral sensitivity, facilitating incremental sparsification as semantic labels or weights change (Zhang et al., 2018).
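
The entropy-minimizing kNN pruning in the first bullet can be sketched as follows. This is an illustrative approximation, not the papers' code: the `adjust` step is a simplified stand-in for the Adjusting Operator (it merely adds a tiny uniform offset and row-normalizes), and the stationary distribution is obtained by a direct eigendecomposition.

```python
import numpy as np

def structural_entropy(P: np.ndarray) -> float:
    """One-dimensional structural entropy from the stationary distribution of a stochastic P."""
    vals, vecs = np.linalg.eig(P.T)                    # left eigenvectors of P
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    pi = np.abs(pi) / np.abs(pi).sum()                 # normalize to a distribution
    return float(-(pi * np.log2(pi + 1e-12)).sum())

def topk_prune(A: np.ndarray, k: int) -> np.ndarray:
    """Keep the k strongest outgoing edges per node, zero the rest."""
    Gk = np.zeros_like(A)
    for i in range(A.shape[0]):
        keep = np.argsort(A[i])[-k:]
        Gk[i, keep] = A[i, keep]
    return Gk

def adjust(Gk: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Simplified adjusting step: tiny uniform offset (forces strong connectivity), then row-normalize."""
    Gk = Gk + eps
    return Gk / Gk.sum(axis=1, keepdims=True)

def select_k(A: np.ndarray) -> int:
    """Return k* = argmin_k H^1(G_k) over k = 1, ..., N-1."""
    N = A.shape[0]
    entropies = [structural_entropy(adjust(topk_prune(A, k))) for k in range(1, N)]
    return int(np.argmin(entropies)) + 1
```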

4. Connectivity, Spectral Integrity, and Pruning Operations

AS-DSGs must ensure global semantic and spectral properties are preserved despite aggressive sparsification:

  • The Adjusting Operator applies Tarjan’s algorithm to identify strongly connected components, inserts minimal “tiny” edges to achieve strong connectivity if necessary, and row-normalizes the adjusted adjacency (Zhao et al., 20 Nov 2025); a simplified sketch appears after this list.
  • In spectral methods, edge addition or deletion is conservatively managed, with removals delayed unless the change to $x^\top L_S x$ is bounded by a permissible $\epsilon$-fraction (Zhang et al., 2018). Global re-sparsification is triggered when spectral discrepancy accumulates.
  • For GST, eigenvalue shifts from edge perturbations are approximated in a first-order regime ($\Delta\lambda_k \approx u_k^\top (\Delta L)\, u_k$), guiding topological scoring and update scheduling (Zhang et al., 2 Feb 2024).
  • Edges that consistently yield higher structural entropy (“negative interference”) or fail to preserve either semantic output or spectral anchors are pruned.
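A hedged sketch of the adjusting step in the first bullet is shown below, using SciPy’s strongly connected component routine (a Tarjan-style SCC detector). Where the tiny bridging edges are placed, namely a cycle over one representative node per component, is an assumption made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def adjust_operator(A: np.ndarray, tiny: float = 1e-6) -> np.ndarray:
    """Make the directed graph strongly connected with tiny edges, then row-normalize."""
    n_comp, labels = connected_components(csr_matrix(A), directed=True,
                                           connection='strong')
    A = A.copy()
    if n_comp > 1:
        # One representative node per SCC, linked in a cycle with tiny-weight edges,
        # which makes the condensation (and hence the whole graph) strongly connected.
        reps = [int(np.flatnonzero(labels == c)[0]) for c in range(n_comp)]
        for a, b in zip(reps, reps[1:] + reps[:1]):
            A[a, b] = max(A[a, b], tiny)
    # Row-normalize so each row is a probability distribution over outgoing edges.
    row_sums = A.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return A / row_sums
```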

5. Theoretical Motivation for Directionality and Adaptive Sparsification

Many semantic relationships—including entailment, generalization, and causation—are fundamentally asymmetric and cannot be faithfully modeled using undirected or symmetrized graphs. Directed construction allows asymmetric semantic “flow,” formalizing reasoning properties such as $r^i$ entailing $r^j$ without $r^j$ necessarily entailing $r^i$ (Zhao et al., 20 Nov 2025).

Further, adaptive sparsification, either by entropy minimization or spectral criterion, filters redundant semantic connections, prevents over-smoothing or performance collapse at high sparsity, and yields subgraphs that remain globally traversable and spectrally stable. The balance between minimal connectivity required for strong component structure and maximal pruning for computational tractability is a central feature of all effective AS-DSG frameworks (Zhao et al., 20 Nov 2025, Zhang et al., 2 Feb 2024, Zhang et al., 2018).

6. Computational Complexity and Scalability

Complexity varies with the sparsification framework:

| Method | Key Complexity Terms | Scaling Features |
|---|---|---|
| Entropy-Minimizing kNN (AS-DSG) | $O(N^2)$ for NLI scoring, $O(kN)$ for top-$k$ pruning | Practical for moderate $N$ |
| GST (GNN training) (Zhang et al., 2 Feb 2024) | $O(E \cdot (Lm + Dn))$ (anchor), $O(m \log m)$ (pruning), $O(Lm)$ per epoch | Sublinear spectral updates, scalable to large graphs |
| Spectral Sparsification (Zhang et al., 2018) | $O(n/\epsilon^2)$ edges, Laplacian solver iterations | Nearly-linear time, incremental updates |

GST discovers subgraphs at $1.67\%$–$53\%$ of the original edge density with no accuracy loss and up to $3.4\times$ speedup in inference. Spectral sparsifiers maintain quadratic-form guarantees and can be incrementally updated on streaming graphs.

7. Empirical Results and Applications

  • In LLM uncertainty quantification, the AS-DSG underpins the SeSE entropy metric, outperforming alternatives by leveraging sparsified, directed structural representations and enabling fine-grained hallucination detection (Zhao et al., 20 Nov 2025).
  • GST-based AS-DSGs maintain both semantic consistency and topological integrity at extreme sparsity (down to $1.67\%$ edge density), halve spectral anchor error compared to single-head methods, and improve network robustness under adversarial or random perturbation (Zhang et al., 2 Feb 2024).
  • In knowledge graph and spectral numerical contexts, spectral sparsifiers reduce edge count to $O(n/\epsilon^2)$, adapt to semantic label changes, and provide end-to-end spectral error bounds (Zhang et al., 2018).

A plausible implication is that AS-DSG principles enable unified graph representations suitable for diverse applications—semantic quantification, scalable learning, numerical solvers—without sacrificing fidelity to underlying semantics or structural guarantees.

