
Hierarchical Lexical Graph (HLG)

Updated 4 August 2025
  • HLG is a formalism that employs multi-level graded graphs to represent complex lexical and semantic structures with clear hierarchical organization.
  • It integrates mathematical, logical, and probabilistic techniques to enable scalable inference, generative modeling, and robust multi-hop retrieval.
  • HLG applications span language theory, neural network architecture, and ontology mapping, offering improved accuracy and efficiency in various computational tasks.

A Hierarchical Lexical Graph (HLG) is a formal mathematical and computational construct designed to represent, manipulate, and reason about language structures, lexical relations, or complex hierarchical information via multi-level, graph-based formalisms. The HLG framework has been developed to address representation, inference, and generative needs in linguistic theory, retrieval-augmented generation, ontology mapping, machine learning architectures, and semantic representations, exploiting hierarchical organization, explicit lineage, topic clustering, and entity-relational connectivity.

1. Mathematical Foundations and Structural Properties

The core mathematical structure of an HLG is a multi-level, graded graph lineage, where each level (grade) encodes a specific abstraction or granularity of the represented lexicon or semantic content (Komatsu, 2021, Mjolsness et al., 31 Jul 2025). Specifically, given a lineage index $l$, the graph at level $l$ is denoted $G_l$, with the number of vertices and edges typically exhibiting quasi-exponential growth:

$$|V(G_l)|,\ |E(G_l)| = O(b^{l^{1+\epsilon}})$$

for some growth parameter $b$ and arbitrarily small $\epsilon > 0$ (Mjolsness et al., 31 Jul 2025). Nodes are assigned non-negative integer grades, and edges connect either nodes within a grade or nodes differing by exactly one grade, formalized via a grading morphism $\varphi_G: G \to \hat{\mathbb{N}}$, with $\hat{\mathbb{N}}$ the canonical grade graph.

The levels are connected not only by intra-level (core) graphs but also by inter-level bipartite graphs, mapping nodes at level $l$ to nodes at level $l+1$. These bipartite connections are encoded using sparse prolongation matrices $P^{(l+1,\,l)}$, ensuring structured, multiscale connectivity and enabling the propagation of information or transformations across the hierarchy.
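
A minimal sketch of level-to-level propagation via a sparse prolongation matrix, assuming toy dimensions and a hard parent assignment (both illustrative, not taken from the cited papers):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy lineage: level l has 3 nodes; level l+1 refines it into 7 nodes.
# P^{(l+1, l)} maps level-l signals up to level l+1; each row selects
# the coarse node that the corresponding fine node descends from.
rows = [0, 1, 2, 3, 4, 5, 6]          # fine-level node indices
cols = [0, 0, 1, 1, 1, 2, 2]          # coarse-level parent of each fine node
P = csr_matrix(([1.0] * 7, (rows, cols)), shape=(7, 3))

x_coarse = np.array([0.2, 0.5, 0.3])  # a signal on the 3 coarse nodes
x_fine = P @ x_coarse                 # prolongated signal on the 7 fine nodes
print(x_fine)                         # [0.2 0.2 0.5 0.5 0.5 0.3 0.3]
```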

Algebraically, HLGs are assembled using skeletal variants of standard graph products: the cross product ($\hat{\times}$), the box product ($\hat{\Box}$), and the disjoint sum ($\oplus$), all modified to respect the graded nature of the vertices and the space-efficiency constraints essential for tractable computation in high-resolution hierarchies (Mjolsness et al., 31 Jul 2025). For example, the skeletal cross product restricts edge formation to cases where the sum of component grades matches the target grade. Unary operators such as thickening $\theta$ and escalation further enable the creation of multiscale and search-frontier structures.
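
As an illustration of grade-locality, here is a simplified reading (an assumption, not the papers' full definition) of the skeletal constraint: only vertex pairs whose component grades sum to the target grade survive in the product:

```python
from itertools import product

# Graded graphs as {vertex: grade}; edges are omitted for brevity.
G = {"a": 0, "b": 1}
H = {"x": 0, "y": 1, "z": 2}

def skeletal_pairs(G, H, target_grade):
    """Vertex pairs of the product whose grades sum to the target grade,
    mirroring the grade-locality constraint of the skeletal products."""
    return [(u, v) for (u, gu), (v, gv) in product(G.items(), H.items())
            if gu + gv == target_grade]

for k in range(4):
    print(k, skeletal_pairs(G, H, k))
# 0 [('a', 'x')]
# 1 [('a', 'y'), ('b', 'x')]
# 2 [('a', 'z'), ('b', 'y')]
# 3 [('b', 'z')]
```

The payoff is combinatorial: the full product pairs every vertex of $G$ with every vertex of $H$, while the skeletal variant keeps only the grade-matched diagonal, which is what keeps high-resolution hierarchies tractable.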

2. Logical and Proof-Theoretic Characterization

HLG’s relevance to language theory is underscored by its proof-theoretic characterization of mildly context-sensitive languages, notably tree-adjoining languages (TAL) (Komatsu, 2021). In this context, HLG is defined as a fragment of the Lambek-Grishin calculus, reified in display calculus with "purely structural connectives" and graph-theoretic constraints:

  • Standard logical connectives (product, division, coproduct) are extended to admit both logical and purely structural forms: for instance, $A \cdot B$ (standard) vs. $A \circ B$ (purely structural), with the latter ensuring strict control over proof transformations and planarity.
  • The central theorem aligns provability in HLG with planarity of proof nets: "Every proof $P$ in LG has a planar proof net if and only if $P$ is provable in HLG." Planarity is critical because it corresponds to the class of hyperedge-replacement (HR2) grammars known to be weakly equivalent to TAL; the planarity condition itself is mechanically checkable, as in the sketch after this list.
  • Cut admissibility is preserved in HLG, ensuring proof normalization and the subformula property. Structural constraints, including rules on knots and planarity preservation, are enforced algorithmically to permit only those proof transformations corresponding to the target mildly context-sensitive generative capacity.
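
A minimal sketch of the mechanical planarity check mentioned above, using networkx on stand-in graphs (an actual LG proof net would replace the toy graphs here):

```python
import networkx as nx

# Stand-in "proof nets": K4 is planar, K5 is the smallest non-planar graph.
candidates = {"K4": nx.complete_graph(4), "K5": nx.complete_graph(5)}

for name, g in candidates.items():
    is_planar, _embedding = nx.check_planarity(g)
    print(name, "planar:", is_planar)   # K4 planar: True, K5 planar: False
```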

This establishes HLG as both a linguistic and logical tool: it precisely captures dependencies, such as cross-serial and non-context-free phenomena, while maintaining tractable, transparent proof-theoretic properties (Komatsu, 2021).

3. Graph-Based Methods for Learning and Generation

HLG has inspired a range of graph-based machine learning methodologies across domains—including semi-supervised classification, generative modeling, and retrieval augmentation—each exploiting the hierarchical properties of the framework.

Semi-Supervised and Hierarchical Graph Classification

In document modeling and classification (Li et al., 2022), HLGs underpin joint instance- and higher-level learning, where:

  • Document-level lexical graphs (e.g., graph-of-words) are embedded via permutation- and size-invariant GNNs.
  • Higher-level graphs are constructed via citation or metadata links between documents, with each node representing an embedded lexical graph.
  • Hierarchical Graph Mutual Information (HGMI) is maximized to encourage consistent semantic structure across both levels, decomposed as

$$I(G; E; \Gamma) = \alpha \cdot [I(G; E) + I(E; \Gamma)]$$

where $G$ is the raw graph, $E$ the instance embeddings, and $\Gamma$ the hierarchical predictions (a toy estimator is sketched after this list).

  • Iterative pseudo-labeling and cautious risk minimization are leveraged to enable learning under extreme label scarcity, with theoretical convergence guarantees.
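
A toy sketch of the HGMI objective, using an InfoNCE-style contrastive score as a stand-in mutual-information estimator (the estimator choice, dimensions, and $\alpha$ value are assumptions, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def infonce(a, b, temperature=0.1):
    """Contrastive lower-bound proxy for MI between paired embeddings:
    rows a[i] and b[i] are positives; all other rows act as negatives."""
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0))
    return -F.cross_entropy(logits, targets)  # higher = more shared information

# Toy batch: raw-graph readouts g, instance embeddings e, hierarchical
# prediction embeddings gamma (all randomly initialized for illustration).
g, e, gamma = (torch.randn(32, 64) for _ in range(3))

alpha = 0.5
hgmi = alpha * (infonce(g, e) + infonce(e, gamma))
loss = -hgmi  # maximize HGMI by minimizing its negation
print(loss.item())
```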

Hierarchical Graph Neural Networks for Text

Semantic HLGs are also instantiated in layered GAT architectures (Hua et al., 2022) that capture information from word-level, sentence-level, and document-level graphs, each constructed and processed with attention at its own granularity and context. Classification accuracy benefits from this explicit breakdown, reaching up to 97.83% on benchmark datasets and outperforming flat, global-graph approaches.
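
A condensed sketch of the word-to-sentence-to-document hierarchy using simple attention pooling in PyTorch (a simplification standing in for the paper's layered GAT; all module names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class AttnPool(nn.Module):
    """Attention pooling: collapses the node embeddings of one level
    into a single embedding for the level above."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                  # x: (num_nodes, dim)
        w = torch.softmax(self.score(x), dim=0)
        return (w * x).sum(dim=0)          # (dim,)

dim = 64
word_to_sent, sent_to_doc = AttnPool(dim), AttnPool(dim)

# Toy document: 3 sentences with 5, 7, and 4 word embeddings each.
sentences = [torch.randn(n, dim) for n in (5, 7, 4)]
sent_embs = torch.stack([word_to_sent(s) for s in sentences])
doc_emb = sent_to_doc(sent_embs)           # document-level representation
print(doc_emb.shape)                       # torch.Size([64])
```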

Probabilistic and Manifold-Based Representations

Recent work leverages hierarchical projections and interpolation over lexical manifolds (Martus et al., 8 Feb 2025, Pendleton et al., 14 Feb 2025):

  • Tokens are projected onto multi-scale Riemannian manifolds, enforcing semantic/geometric smoothness with adaptive projections:

$$P_h(e_i) = \sum_j \alpha_{ij} \exp(-\lambda \cdot d_{\mathcal{M}}(x_i, x_j))\, e_j$$

where $d_{\mathcal{M}}$ is the geodesic distance and $\alpha_{ij}$ are adaptive weights (see the sketch after this list).

  • Probabilistic function spaces and vector field interpolation define word embeddings as evolving points on continuous manifolds, regularized to minimize divergence from original context-induced distributions:

$$D_{\mathrm{KL}}(p_{\text{interp}} \parallel p_{\text{orig}}) = \int_{\mathcal{M}} p_{\text{interp}}(w) \log \frac{p_{\text{interp}}(w)}{p_{\text{orig}}(w)}\, d\mu(w)$$

  • These methods yield improved semantic stability, density alignment, and robustness under perturbation compared to standard transformer-based discrete token embeddings.
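
A numeric sketch of the adaptive projection $P_h$ above, with two stated simplifications: Euclidean distance stands in for the geodesic $d_{\mathcal{M}}$, and the adaptive weights $\alpha_{ij}$ are folded into a row normalization of the kernel:

```python
import numpy as np

def hierarchical_projection(E, X, lam=1.0):
    """P_h(e_i) = sum_j alpha_ij * exp(-lam * d(x_i, x_j)) * e_j,
    with Euclidean d as a stand-in for the geodesic distance and
    alpha_ij chosen so each kernel row sums to 1."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise d
    K = np.exp(-lam * D)
    alpha = K / K.sum(axis=1, keepdims=True)
    return alpha @ E

E = np.random.randn(10, 8)   # token embeddings e_j (toy sizes)
X = np.random.randn(10, 3)   # manifold coordinates x_j
print(hierarchical_projection(E, X).shape)   # (10, 8)
```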

4. Retrieval-Augmented Generation and Ontological Applications

The HLG paradigm is fundamental in multi-hop retrieval systems, most notably Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval (Ghassel et al., 9 Jun 2025), where:

  • A three-tier index underpins retrieval (sketched in code after this list):

    1. Lineage Tier: connects atomic propositions to their exact source, preserving full provenance.
    2. Entity-Relationship Tier: encodes key entities and relations, allowing entity-based traversal of cross-document paths.
    3. Summarization Tier: clusters propositions into latent topics for thematic/high-level retrieval.
  • StatementGraphRAG and TopicGraphRAG utilize fine-grained or topic-level graph traversals, leveraging entity-aware beam search and topic expansion, and achieve over 23.1% relative improvement in retrieval recall and correctness compared with naïve chunk-based methods.

  • Synthetic benchmark pipelines are provided to rigorously evaluate multi-hop capabilities; practical open-source implementations are available.
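
A minimal data-structure sketch of the three tiers (names and methods here are hypothetical, not the API of the open-source release):

```python
from dataclasses import dataclass, field

@dataclass
class Proposition:
    text: str
    source_doc: str                        # Lineage tier: exact provenance
    entities: set[str] = field(default_factory=set)

@dataclass
class HLGIndex:
    props: list[Proposition] = field(default_factory=list)
    by_entity: dict[str, list[int]] = field(default_factory=dict)   # E-R tier
    by_topic: dict[str, list[int]] = field(default_factory=dict)    # summarization tier

    def add(self, p: Proposition, topic: str):
        idx = len(self.props)
        self.props.append(p)
        for ent in p.entities:
            self.by_entity.setdefault(ent, []).append(idx)
        self.by_topic.setdefault(topic, []).append(idx)

    def hop(self, entity: str) -> list[Proposition]:
        """One entity-based hop: every proposition mentioning the entity,
        across documents, which is what enables multi-hop traversal."""
        return [self.props[i] for i in self.by_entity.get(entity, [])]

ix = HLGIndex()
ix.add(Proposition("Marie Curie won the 1903 Nobel Prize.", "doc1", {"Marie Curie"}), "nobel")
ix.add(Proposition("Marie Curie was born in Warsaw.", "doc2", {"Marie Curie", "Warsaw"}), "bio")
print([p.source_doc for p in ix.hop("Marie Curie")])   # ['doc1', 'doc2']
```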

In ontology mapping (e.g., SLHCat (Wang et al., 2023)), hierarchical, lexical, and semantic features are combined for large-scale alignment:

  • Class names are processed through root phrase extraction, lexical similarity using WordNet, and semantic embeddings (SimCSE/BERT).
  • Hierarchical inheritance propagates confident mappings from parent to descendant nodes (a sketch follows this list), which, together with prompt-tuned BERT models, increases mapping accuracy by 25% over baselines.
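
A small sketch of the inheritance step (the confidence threshold and decay factor are assumptions for illustration):

```python
def propagate_mappings(children, mappings, threshold=0.9, decay=0.95):
    """Push confident class mappings down a class hierarchy: descendants
    without a mapping inherit the parent's target with decayed confidence."""
    frontier = [c for c in children if c in mappings]
    while frontier:
        parent = frontier.pop()
        target, conf = mappings[parent]
        if conf < threshold:
            continue
        for child in children.get(parent, []):
            if child not in mappings:
                mappings[child] = (target, conf * decay)
                frontier.append(child)
    return mappings

children = {"Agent": ["Person"], "Person": ["Athlete"]}
maps = {"Agent": ("schema:Thing", 0.96)}
print(propagate_mappings(children, maps))
# Person inherits at 0.912; Athlete inherits at ~0.866 and stops propagating.
```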

5. Algebraic Type Theory and Theoretical Implications

HLG theory is grounded in an algebraic type theory for graded graphs (Mjolsness et al., 31 Jul 2025):

  • Graded graphs are treated as objects in a slice category over $\hat{\mathbb{N}}$, enabling universal, type-safe definitions of core operations such as product, sum, and thickening.
  • Skeletal binary operators constrain combinatorial explosion by respecting grade-locality.
  • Pullback diagrams and universal properties establish the categorical structure needed for symbolic manipulation and analytic tractability.

Applications are demonstrated in:

  • Deep learning: constructing CNNs via the skeletal box product of spatial and feature lineages, yielding architectures with accuracy and parameter efficiency comparable to traditional CNNs while remaining amenable to symbolic manipulation (a toy shape-level illustration follows this list).
  • Numerical PDEs: skeletal multigrid solvers constructed from lineage products demonstrate competitive or improved convergence rates alongside resource savings.
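
A deliberately tiny, shape-level illustration (an assumed reading of the construction, not the paper's actual operator) of how pairing a spatial coarsening lineage with a feature-widening lineage reads off CNN layer shapes level by level:

```python
# Spatial lineage: feature-map side length shrinks level by level.
# Feature lineage: channel count grows level by level.
spatial = [32, 16, 8, 4]
features = [3, 16, 32, 64]

# Aligning the two lineages grade-by-grade, as a skeletal box product
# would, yields the activation-tensor shape at each depth.
for level, (side, ch) in enumerate(zip(spatial, features)):
    print(f"level {level}: activation tensor {side}x{side}x{ch}")
```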

The growth bound $|V(G_l)| = O(b^{l^{1+\epsilon}})$ encapsulates the scalable nature of these lineages, preserving multiscale (pyramidal) structure and supporting continuum-limit modeling via thickening and escalation.

6. Graph-Theoretic and Generative Modeling Perspectives

HLG underpins scalable graph generation frameworks (Karami, 2023), where:

  • Hierarchies are built by recursive clustering, with each level representing communities and super-nodes.
  • Probabilistic modeling is explicit: intra-community and inter-community subgraphs are generated via multinomial and stick-breaking decompositions, enabling scalable, block-wise graph synthesis (see the sketch after this list).
  • Modularity ensures parallelizable, locally-conditioned generation of large lexical graphs, making the approach suitable for lexical resources (e.g., wordnets, semantic networks) with both local and cross-domain relations.
  • The block-structural adjacency matrix and autoregressive factorization yield statistical accuracy and sampling efficiency improvements on diverse benchmarks.
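
A single-level sketch of block-wise generation (one hierarchy level only; community proportions via stick-breaking, then independent intra-/inter-community edge sampling; all probabilities are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def stick_breaking(alpha, k):
    """Draw k community proportions via the stick-breaking construction."""
    betas = rng.beta(1.0, alpha, size=k)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    w = betas * remaining
    return w / w.sum()

def generate(n_nodes=60, k=4, p_in=0.3, p_out=0.02):
    """Assign nodes to communities, then sample dense intra-community and
    sparse inter-community edges; each block is independent, so blocks
    can be generated in parallel."""
    comm = rng.choice(k, size=n_nodes, p=stick_breaking(alpha=2.0, k=k))
    A = np.zeros((n_nodes, n_nodes), dtype=int)
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            p = p_in if comm[i] == comm[j] else p_out
            A[i, j] = A[j, i] = rng.random() < p
    return A, comm

A, comm = generate()
print(A.sum() // 2, "edges;", np.bincount(comm, minlength=4), "community sizes")
```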

7. Linguistic and Computational Significance

The HLG framework achieves several linguistically and computationally significant outcomes:

  • Proof-theoretic adequacy: precisely characterizes tree-adjoining and other mildly context-sensitive languages, positioning HLG as a foundational logic for syntactic phenomena previously unattainable by sequent or display calculi (Komatsu, 2021).
  • Compositional interpretability: separation of logical and purely structural connectives exposes the direct correspondence between graph transformations and syntactic dependencies.
  • Scalability across domains: modular, graded, and probabilistically regularized representations enable HLGs to be adapted to graph-based deep learning, retrieval, and ontology systems at industrial scale, as evidenced by open-source libraries and benchmark evaluations.
  • Robustness and stability: manifold-based and probabilistically interpolated representations enhance resilience to adversarial or noisy input, preserving semantic coherence.

A plausible implication, underscored by the analytic results and broad applicability, is that the HLG formalism—through its synthesis of algebraic, proof-theoretic, and probabilistic graph techniques—constitutes a unifying framework for bridging language theory, neural network modeling, and practical, scalable graph-based computation.