Structural Semantic Entropy (SeSE)

Updated 25 May 2026
  • Structural Semantic Entropy (SeSE) is a measure that quantifies uncertainty by analyzing semantic structures and interrelations among model outputs.
  • It extends traditional entropy methods by using graph-based estimators to capture hierarchical, directional, and continuous semantic relationships.
  • SeSE has practical applications in improving hallucination detection in LLMs and uncertainty quantification in tasks like QA, summarization, and translation.

Structural Semantic Entropy (SeSE) quantifies the uncertainty or diversity of a set of model outputs by measuring the richness of their underlying semantic structure. Unlike traditional entropy measures, which assess variability in flat or unstructured output spaces, SeSE incorporates latent structural information among the outputs (hierarchical, directional, and topological relations), typically modeled via graphs built from semantic similarity or entailment. Recent work reports that SeSE and related graph-theoretic entropy formulations enable more nuanced uncertainty quantification, improve hallucination detection in LLMs, and reveal critical dynamics underlying adaptive reasoning and discovery in autonomous systems (Buehler, 24 Mar 2025; Nguyen et al., 30 May 2025; Zhao et al., 20 Nov 2025).

1. Mathematical Formulation of Structural Semantic Entropy

SeSE fundamentally extends Shannon-style entropy into the domain of structured semantic spaces, employing spectral, tree-based, or kNN-inspired estimators on graphs whose nodes represent model-generated answers and edges encode pairwise semantic relationships.

Core Formalism

Let $G=(V,E,W)$ be a semantic graph constructed from a set of responses or concepts, with $V$ the node set, $E$ the edge set (directed or undirected), and $W$ the edge weights (e.g., entailment probabilities or semantic similarities). Several entropy measures have been defined on such a graph:

$$S_{\text{struct}} = -\sum_{i=1}^{N} p_i \log p_i, \quad p_i = \frac{\lambda_i}{\sum_j \lambda_j}$$

where $\{\lambda_i\}$ are the eigenvalues of the normalized Laplacian $L = I - D^{-1/2} A D^{-1/2}$, $A$ is the adjacency matrix, and $D$ is the degree matrix.
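As a rough illustration of the spectral definition above, the following Python sketch computes $S_{\text{struct}}$ for a small undirected graph with NumPy. The function name `spectral_structural_entropy` and the toy adjacency matrix are illustrative assumptions, not taken from the cited papers. The sketch also assumes a symmetric adjacency matrix in which every node has positive degree.

```python
import numpy as np

def spectral_structural_entropy(A):
    """Shannon entropy of the normalized-Laplacian spectrum of an undirected graph."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    if np.any(deg <= 0):
        raise ValueError("every node needs positive degree")
    inv_sqrt = 1.0 / np.sqrt(deg)
    # L = I - D^{-1/2} A D^{-1/2}
    L = np.eye(A.shape[0]) - inv_sqrt[:, None] * A * inv_sqrt[None, :]
    # L is symmetric, so its spectrum is real; clip tiny negative round-off
    lam = np.clip(np.linalg.eigvalsh(L), 0.0, None)
    p = lam / lam.sum()
    p = p[p > 0]  # convention: 0 * log 0 = 0
    return float(-(p * np.log(p)).sum())

# Toy example (assumed data): a triangle graph on three nodes
triangle = np.ones((3, 3)) - np.eye(3)
print(spectral_structural_entropy(triangle))  # approx. 0.6931, i.e. log 2
```

For the triangle graph the normalized Laplacian has eigenvalues 0, 3/2, 3/2, so the normalized spectrum is (0, 1/2, 1/2) and the entropy is $\log 2$.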

  • Semantic Entropy (SE): Uses clustering or partitioning of responses by semantic similarity, with cluster probabilities either length-normalized or empirical:

$$\mathrm{SE}(q) = -\sum_{k=1}^{M} \bar{P}(C_k) \log \bar{P}(C_k)$$

where $\bar{P}(C_k)$ is the normalized mass of cluster $C_k$ (Nguyen et al., 30 May 2025).

  • Pairwise/Structural Estimators: Extend SE to continuous similarity metrics using nearest-neighbor or LogSumExp forms over all pairs of responses, built from a semantic similarity function and a temperature parameter (Nguyen et al., 30 May 2025).

  • Hierarchical Directed Entropy (SeSE for LLM UQ): Defined on an adaptively sparsified, directed semantic graph as the minimized tree-encoded directed graph entropy, i.e., the structural entropy of the graph under its optimal encoding tree (Zhao et al., 20 Nov 2025).

These measures aim to capture both the diversity of semantic outputs and the richness of their structural interrelations.
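The cluster-based and pairwise estimators can be contrasted on toy data. The sketch below is a minimal illustration in Python: `semantic_entropy` follows the SE(q) formula with cluster masses taken from (optionally weighted) counts, while `pairwise_lse_entropy` is one plausible LogSumExp form, $-\frac{1}{n}\sum_i \log \sum_j \exp(s_{ij}/\tau)$, chosen here as an assumption because the exact estimator is not reproduced in this section. All function names, the similarity matrices, and the default temperature are hypothetical.

```python
import math
from collections import defaultdict

import numpy as np

def semantic_entropy(cluster_ids, weights=None):
    """Entropy of normalized cluster masses; weights default to 1 per response."""
    if weights is None:
        weights = [1.0] * len(cluster_ids)
    mass = defaultdict(float)
    for c, w in zip(cluster_ids, weights):
        mass[c] += w
    total = sum(mass.values())
    return -sum((m / total) * math.log(m / total) for m in mass.values() if m > 0)

def pairwise_lse_entropy(S, tau=1.0):
    """Assumed form: -(1/n) * sum_i log sum_j exp(S[i, j] / tau)."""
    Z = np.asarray(S, dtype=float) / tau
    m = Z.max(axis=1, keepdims=True)  # shift rows for numerical stability
    lse = m[:, 0] + np.log(np.exp(Z - m).sum(axis=1))
    return float(-lse.mean())

# Four responses split evenly across two semantic clusters (assumed data)
print(semantic_entropy(["paris", "paris", "lyon", "lyon"]))  # log 2, approx. 0.6931

agree = np.ones((3, 3))   # three mutually similar answers
disagree = np.eye(3)      # three mutually dissimilar answers
print(pairwise_lse_entropy(agree) < pairwise_lse_entropy(disagree))  # True
```

Under this assumed form, a set of mutually similar responses scores lower than a set of dissimilar ones, and the score varies continuously with the similarity values rather than depending on a hard cluster assignment.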

2. Structural Semantic Entropy in LLMs

State-of-the-art SeSE estimators have been employed to quantify uncertainty and hallucination propensity in LLM outputs, surpassing classical entropy and pairwise embedding baselines.

Semantic Graph Construction

  • Node set: Sampled LLM outputs (e.g., answers, summaries).
  • Edge weights: Computed via NLI-model entailment probabilities or cosine embedding similarities. Directionality encodes semantic dependency.
  • Graph sparsification: Adaptive selection of the top-$k$ outgoing edges per node to form minimal but informative graphs, often constrained to form strongly connected Markov chains (Zhao et al., 20 Nov 2025).
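A fixed-$k$ version of this sparsification step might look like the following sketch. It assumes a dense matrix `W` of directed edge weights (for example, entailment probabilities) and a hand-picked `k`. The adaptive choice of $k$ described in the source is not implemented, and the helper names `top_k_sparsify` and `is_strongly_connected` are illustrative.

```python
import numpy as np

def top_k_sparsify(W, k):
    """Keep the k largest outgoing edge weights per node; drop self-loops."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    out = np.zeros_like(W)
    for i in range(W.shape[0]):
        keep = np.argsort(W[i])[::-1][:k]  # indices of the k largest weights in row i
        out[i, keep] = W[i, keep]
    return out

def is_strongly_connected(A):
    """True if every node can reach every other node along positive-weight edges."""
    n = A.shape[0]
    reach = (np.asarray(A) > 0) | np.eye(n, dtype=bool)
    for _ in range(n):
        # each squaring doubles the path length covered by `reach`
        reach = (reach.astype(int) @ reach.astype(int)) > 0
    return bool(reach.all())

# Assumed toy weights between three sampled responses
W = np.array([[0.0, 0.9, 0.1],
              [0.2, 0.0, 0.8],
              [0.7, 0.3, 0.0]])
G = top_k_sparsify(W, k=1)
print(G)
print(is_strongly_connected(G))  # True: the kept edges form the cycle 0 -> 1 -> 2 -> 0
```

If the sparsified graph is not strongly connected, one simple option is to increase `k` and retry; whether that matches the adaptive procedure in the cited work is not established here.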

SeSE Score Pipeline

  1. Generate candidate responses.
  2. Build and sparsify a directed semantic graph.
  3. Compute the optimal encoding tree that hierarchically aggregates related responses, minimizing total entropy.
  4. Evaluate the SeSE score as the structural entropy of the graph under this encoding tree.

A higher SeSE indicates a structure that is less compressible and hence more uncertain, which is closely tied to hallucination probability in LLMs.
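To make the encoding-tree step concrete, here is a simplified sketch of structural entropy under a height-2 encoding tree (a partition of the nodes into modules), with a naive greedy merge that accepts a merge only when it lowers the entropy. It is a simplification under stated assumptions: it uses an undirected, symmetric weight matrix with zero diagonal and positive degrees, and the standard two-level structural entropy formula, rather than the directed, multi-level formulation of Zhao et al. The names `tree_entropy` and `greedy_encoding_tree` are hypothetical.

```python
import itertools

import numpy as np

def tree_entropy(A, modules):
    """Structural entropy of an undirected weighted graph under a node partition."""
    deg = A.sum(axis=1)
    vol = deg.sum()
    h = 0.0
    for mod in modules:
        idx = sorted(mod)
        v_mod = deg[idx].sum()
        inside = A[np.ix_(idx, idx)].sum()  # counts each internal edge twice
        cut = v_mod - inside                # total weight leaving the module
        h -= (cut / vol) * np.log2(v_mod / vol)
        h -= sum((deg[v] / vol) * np.log2(deg[v] / v_mod) for v in idx)
    return float(h)

def greedy_encoding_tree(A):
    """Merge modules pairwise while the entropy keeps decreasing (heuristic)."""
    modules = [frozenset([v]) for v in range(A.shape[0])]
    best = tree_entropy(A, modules)
    while len(modules) > 1:
        candidates = []
        for a, b in itertools.combinations(modules, 2):
            merged = [m for m in modules if m not in (a, b)] + [a | b]
            candidates.append((tree_entropy(A, merged), merged))
        h, merged = min(candidates, key=lambda c: c[0])
        if h >= best - 1e-12:
            break  # no merge lowers the entropy further
        best, modules = h, merged
    return best, modules

# Assumed toy graph: two triangles joined by a single edge
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

print(tree_entropy(A, [[v] for v in range(6)]))      # every node in its own module
print(tree_entropy(A, [[0, 1, 2], [3, 4, 5]]))       # one module per triangle
print(greedy_encoding_tree(A))
```

On this graph, grouping each triangle into a module gives a lower entropy than leaving every node in its own module, reflecting the more compressible structure. The greedy search is only a heuristic and may stop at a local optimum.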

3. Theoretical Properties and Generalization

Structural Semantic Entropy frameworks unify and generalize prior entropy measures:

  • Limit reduction to flat SE/DSE: Special choices of the similarity function or clustering yield classical (discrete or soft) semantic entropy as special cases (Nguyen et al., 30 May 2025).
  • Incorporation of hierarchical, directional, and continuous structure: The optimal encoding tree and pairwise similarity matrix allow SeSE to account for both intra-cluster spread and inter-cluster proximity, capturing complexity overlooked by cluster-counting or flat pseudoprobability methods.
  • Directed entropy minimization: Enables explicit modeling of entailment directionality, essential for representing semantic asymmetries in argumentation, explanation, or question answering (Zhao et al., 20 Nov 2025).

These properties render SeSE strictly more expressive than earlier flat-structure measures.

4. Empirical Evidence and Applications

Hallucination Detection and Uncertainty Quantification

SeSE and related estimators (SNNE, WSNNE) have produced marked improvements in:

  • Binary QA (e.g., SQuAD, TriviaQA, BioASQ): AUROC gains of 6–15% relative to SE and kernel-based UQ, and ~3–5% over the next-strongest methods (Zhao et al., 20 Nov 2025).
  • Summarization and translation: Prediction rejection ratio (PRR) and BERTScore metrics favor SeSE/SNNE substantially over SE/DSE (Nguyen et al., 30 May 2025).
  • Fine-grained claim-level uncertainty in long-form LLM outputs: SeSE enables the ranking of atomic claims by hallucination risk using a semantic-structural perspective.
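For reference, AUROC in this setting measures how well an uncertainty score ranks hallucinated answers above correct ones. The sketch below computes it from pairwise comparisons. The function name `auroc` and the toy scores and labels are made up for illustration and are not results from the cited papers.

```python
def auroc(scores, labels):
    """Probability that a hallucinated answer (label 1) gets a higher
    uncertainty score than a correct one (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Assumed uncertainty scores for four answers; 1 = hallucinated, 0 = correct
print(auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 0.75
```

A value of 0.5 corresponds to a score that is no better than chance at separating the two classes, and 1.0 to a perfect ranking.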

Continuous Discovery in Agentic Systems

Agentic reasoning systems evolving over semantic graphs exhibit:

  • Persistent excess of semantic entropy over structural entropy, tracked by a critical discovery parameter (Buehler, 24 Mar 2025).
  • Stable injection of “surprising” (cross-domain) semantic edges, facilitating self-organized criticality and continuous innovation.
  • Power-law degree distributions, small-world topology, and negative late-stage cross-correlation between semantic and structural entropy, all signaling adaptation at criticality.

These observations reveal SeSE’s utility as a monitoring signal and design axis for adaptive, open-ended reasoning architectures.

5. Algorithmic and Implementation Aspects

Complexity

  • Pairwise methods: $O(N^2)$ similarity evaluations for $N$ sampled outputs.
  • Graph construction and entropy minimization: For typical sample sizes, per-query wall-time is a few seconds on GPU, covering NLI-based edge computation and hierarchical encoding tree optimization (Zhao et al., 20 Nov 2025).

Core Components

| Component | Method | Typical Techniques |
| --- | --- | --- |
| Semantic Similarity | Cosine, NLI entailment | Sentence transformers, DeBERTa-v3-mnli |
| Graph Sparsification | Top-$k$ selection | Adaptive entropy minimization to choose $k$ |
| Tree Encoding Optimization | Greedy merging | Greedy coupling, minimizing hierarchical directed entropy |

Parameter Choices

  • Sampling: WW0 at WW1, one greedy decode at WW2.
  • Tree height: WW3–WW4 sufficient in practice.
  • NLI model: DeBERTa-v3-large-MNLI for entailment computations.

6. Broader Significance and Criticality Interpretation

SeSE measures offer principled, theoretically grounded metrics for:

  • Quantitative analysis of semantic “disorder” and information compressibility in model outputs.
  • Detection and prevention of hallucination, promoting abstention in high-uncertainty regimes.
  • Steering generative agents toward sustained innovation and criticality, leveraging entropy balance between structure and meaning (Buehler, 24 Mar 2025).

A key implication is that a regime in which semantic entropy persistently exceeds structural entropy may underlie model adaptability and robust discovery. Practical reinforcement objectives could directly incorporate the critical discovery parameter, semantic entropy, and the “surprising edge” rate to maintain exploratory capacity.

7. Connections and Distinctions among SeSE Frameworks

Three recent SeSE-related formulations emphasize complementary facets:

  • Graph-theoretic SeSE (Buehler, 24 Mar 2025): Focuses on structural vs. semantic entropy via graph Laplacians, criticality, and complex-system parallels.
  • Pairwise/nearest-neighbor SNNE/WSNNE (Nguyen et al., 30 May 2025): Operationalizes structural semantic entropy in LLM UQ through continuous pairwise similarity aggregation, generalizing cluster-based approaches.
  • Hierarchical directed SeSE for LLMs (Zhao et al., 20 Nov 2025): Captures directional, compressible structure in generated outputs and is reported to perform best for fine-grained hallucination detection.

A plausible implication is that unified SeSE concepts can inform cross-disciplinary investigation—linking natural language generation, complex adaptive system theory, and scalable uncertainty quantification under a common structural-semantic information-theoretic lens.
