Hierarchical Concept Geometry

Updated 27 May 2026

Hierarchical concept geometry is the study of mapping abstract concept hierarchies into geometric spaces using spectral, linear, and metric models.
It applies spectral kernels, cone embeddings, and hyperbolic and spacetime representations to reveal underlying semantic structures and hierarchical relationships.
Empirical results demonstrate improved interpretability, efficient search, and robust semantic reasoning in language models and knowledge graphs.

Hierarchical concept geometry is the study of how abstract concepts and their semantic relations—particularly hierarchical relations such as hypernymy, partonomies, and ontological containment—are encoded, represented, and manipulated within geometric spaces. The field integrates methodologies from spectral graph theory, Riemannian geometry, topological data analysis, and modern neural embedding techniques to formalize and exploit the geometry underlying concept hierarchies in LLMs, knowledge graphs, and high-dimensional data. Its central aim is to provide a rigorous mathematical and empirical account of how hierarchy is mapped to geometric structure, with implications for interpretability, efficient search, and semantic reasoning.

1. Foundations: Spectral, Linear, and Metric Models

Hierarchical relations in concept sets are commonly represented as trees or directed acyclic graphs (DAGs), with nodes representing concepts and edges encoding "is-a," "part-of," or related taxonomic links. Several formal frameworks have emerged for mapping this symbolic structure to geometry:

Spectral organization: Co-occurrence statistics from large-scale text corpora, filtered through taxonomic graphs such as WordNet, define pairwise similarity kernels. Under the assumption that co-occurrence decays with semantic distance on a tree (i.e., $K(i,j) = f(\mathrm{dist}(i,j))$ with $f$ positive and strictly decreasing), the Gram matrix of normalized co-occurrences acquires a distinctive eigenspectrum. The leading eigenvectors separate broad branches (e.g., "animal" vs. "plant"), and subsequent components recursively split finer branches, producing a coarse-to-fine spectral geometry that mirrors the hierarchy itself (Nava et al., 22 May 2026). The resulting block-diagonalization of the Gram matrix matches the decomposition into scaling and wavelet subspaces associated with the levels and splits of the tree.
Linear representation hypothesis: In neural models, especially LLMs, binary and categorical concepts correspond to directions or polytopes in representation space. Hierarchical inheritance induces strict geometric orthogonality: if $w_0 \prec w_1 \prec w_2$ is a chain in the hierarchy, successive difference-vectors are orthogonal, and categorical subspaces are direct sums of simplices orthogonal to their ancestors. Empirical tests confirm these predictions for over 900 WordNet-derived concepts in Gemma and LLaMA-3 (Park et al., 2024).
Metric cones, polar geometry, and hyperbolic manifolds: Metric cone models $C^\beta(Z)$ provide a canonical embedding of hierarchy, with a scalar "height" $s$ capturing depth from the root. The distance from the apex to a node encodes its hierarchical level, while the angular component preserves lateral similarity. Metric cones can be constructed over arbitrary base spaces (Euclidean, hyperbolic, etc.), enabling both joint learning and extraction of hierarchical depth from pre-trained embeddings (Takehara et al., 2021). Polar and spherical frameworks (e.g., the Polaris system (Mishra et al., 30 Apr 2026)) decouple depth and semantics, mapping hierarchy to radius and ontological content to angular directions.

2. Geometric Embedding Schemes for Hierarchy

Several geometric embedding paradigms are used for encoding hierarchical data:

Model	Base Geometry	Hierarchy Encoding
Spectral tree kernel (Nava et al., 22 May 2026)	Implicit (Gram matrix)	Spectral splits per tree levels
Cone embedding (Takehara et al., 2021)	Metric cone over Z	Scalar height = depth
Polaris (polar) (Mishra et al., 30 Apr 2026)	Hypersphere	Radius = depth; angle = meaning
Hyperbolic/HyHTM (Shahid et al., 2023)	Poincaré ball	Distance to origin (approximate)
Spacetime (Anabalon et al., 7 May 2025)	3D Minkowski	Timelike separation encodes order

Spectral models: Use distance-controlled kernels on trees to induce block-diagonal Gram matrices whose eigenvectors reflect splitting at every hierarchical level. The learnable geometry is entirely dictated by the spectral properties of pairwise statistics (Nava et al., 22 May 2026).
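
The coarse-to-fine splitting can be checked numerically on a toy case. The sketch below is illustrative only: it assumes a complete binary tree with eight leaf concepts and an exponential kernel $f(d) = e^{-d/2}$ (both are assumptions, not choices taken from the cited work), builds the distance-controlled kernel directly from tree distances rather than from corpus statistics, and inspects the leading eigenvectors.

```python
import numpy as np

def leaf_distance(i: int, j: int) -> int:
    """Path length between leaves i and j of a complete binary tree."""
    levels = 0
    while i != j:          # climb until both leaves share an ancestor
        i //= 2
        j //= 2
        levels += 1
    return 2 * levels

depth = 3                  # toy tree with 2**3 = 8 leaf concepts
n = 2 ** depth
dist = np.array([[leaf_distance(i, j) for j in range(n)] for i in range(n)])
K = np.exp(-0.5 * dist)    # assumed kernel f(d) = exp(-d / 2)

eigvals, eigvecs = np.linalg.eigh(K)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]         # re-sort, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

constant_mode = eigvecs[:, 0]   # same magnitude on every leaf
root_split = eigvecs[:, 1]      # opposite signs on the two root branches

print(np.round(eigvals, 3))
print(np.sign(root_split).astype(int))
```

In this toy setting the second eigenvector takes one sign on leaves 0 to 3 and the opposite sign on leaves 4 to 7, and the next eigenvalue is doubly degenerate, one mode per second-level split. The overall sign of each eigenvector is arbitrary.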

Cone models: Each concept $u$ is assigned a coordinate $(z_u, s_u)$ in $C^\beta(Z)$, with $s_u$ indicating depth. The apex ($s=0$) represents the root. The cone metric guarantees unique, interpretable hierarchical positions and negative curvature suitable for trees (Takehara et al., 2021).
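
To make the role of the height coordinate concrete, the sketch below implements the textbook Euclidean cone distance over a metric space. This is an assumption made for illustration: the $\beta$-parametrized metric of $C^\beta(Z)$ in the cited work may differ and is not reproduced here. The function name `cone_distance` is hypothetical.

```python
import math

def cone_distance(d_base: float, s: float, t: float) -> float:
    """Textbook Euclidean-cone distance between (x, s) and (y, t).

    d_base is the base-space distance d_Z(x, y); s and t are heights
    (distances from the apex). Not the beta-parametrized variant.
    """
    angle = min(d_base, math.pi)               # base distances are capped at pi
    sq = s * s + t * t - 2.0 * s * t * math.cos(angle)
    return math.sqrt(max(sq, 0.0))             # guard against tiny negatives

# Distance from the apex (root) depends only on the height.
print(cone_distance(0.3, 0.0, 2.0))
# Same base point, different heights: distance is the height gap.
print(cone_distance(0.0, 1.0, 3.0))
# Far-apart base points: the shortest path passes through the apex.
print(cone_distance(5.0, 1.0, 3.0))
```

Under this metric a node's distance to the apex equals its height, which is what lets the height be read as depth from the root.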

Polar/spherical models: To separate semantic content from depth, Polaris assigns each concept a radius that encodes tree depth (the root at one end of the radial range, leaves near the other) and an angular direction that encodes semantics. Spherical layers and regularization enforce level-based separation and stable containment (Mishra et al., 30 Apr 2026).

Hyperbolic geometry: HyHTM utilizes the Poincaré ball, where exponential expansion near the boundary naturally encodes tree branching. However, the origin is arbitrary, and hierarchy may be ambiguous if not regularized (Shahid et al., 2023). Cone and polar models address this with explicit radial variables.
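
The exponential expansion near the boundary follows from the standard Poincaré ball distance. The snippet below is a minimal sketch of that formula, not HyHTM's implementation; the sample points are arbitrary.

```python
import numpy as np

def poincare_distance(u, v) -> float:
    """Geodesic distance between two points inside the unit Poincare ball."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    sq_gap = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_gap / denom))

# Two pairs with the same Euclidean gap (0.1), at different radii.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.85, 0.0], [0.95, 0.0])
print(near_origin, near_boundary)
```

The same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary, which leaves room for many well-separated leaves.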

Spacetime embeddings: In (Anabalon et al., 7 May 2025), hierarchies are perfectly embedded in 3D Minkowski space, where causality relations encode the partial order: a child lies in the future lightcone of its parent, and ancestor-descendant queries become lightcone or proper time calculations.
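
An ancestor-descendant query of this kind reduces to a sign test on the Minkowski interval. The sketch below assumes events stored as `(t, x, y)` tuples with the speed of light set to 1. The coordinates are hand-picked for illustration, not learned embeddings, and the function names are hypothetical.

```python
import math

def in_future_lightcone(parent, child) -> bool:
    """True if `child` lies in the future lightcone of `parent`."""
    dt = child[0] - parent[0]
    dx2 = sum((c - p) ** 2 for p, c in zip(parent[1:], child[1:]))
    return dt > 0 and dt * dt >= dx2

def proper_time(a, b):
    """Proper time between events a and b, or None if not timelike-separated."""
    dt = b[0] - a[0]
    dx2 = sum((q - p) ** 2 for p, q in zip(a[1:], b[1:]))
    interval = dt * dt - dx2
    return math.sqrt(interval) if interval >= 0 else None

# Toy events (t, x, y); values are illustrative only.
events = {
    "entity": (0.0, 0.0, 0.0),
    "animal": (2.0, 1.0, 0.0),
    "plant": (2.0, -1.0, 0.0),
    "dog": (4.0, 1.5, 0.5),
}

for a, b in [("entity", "dog"), ("animal", "dog"), ("plant", "dog")]:
    print(a, "->", b, in_future_lightcone(events[a], events[b]))
```

Because causal order is transitive, "dog" falls inside the cone of "entity" as soon as it is inside the cone of "animal", while "plant" and "dog" remain causally unrelated.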

3. Mathematical Properties and Theoretical Guarantees

Spectral theory: Theorems establish that a symmetric distance-controlled kernel $K(i,j) = f(\mathrm{dist}(i,j))$ on a tree yields eigenvectors (scaling and wavelet modes) precisely aligned with the tree's splits. The hierarchy level is mapped to eigenspace order, with block structure enforcing coarse-to-fine separation (Nava et al., 22 May 2026).
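
As a worked illustration (a toy case, not taken from the cited work), consider a balanced binary tree with four leaves, where leaves 1, 2 and leaves 3, 4 are sibling pairs. Writing $a = f(0)$, $b = f(2)$, and $c = f(4)$ for the kernel values at tree distances 0, 2, and 4, the kernel matrix and its eigenpairs are:

$$
K = \begin{pmatrix} a & b & c & c \\ b & a & c & c \\ c & c & a & b \\ c & c & b & a \end{pmatrix}
$$

$$
\begin{aligned}
(1, 1, 1, 1)^\top &: \quad \lambda_0 = a + b + 2c \\
(1, 1, -1, -1)^\top &: \quad \lambda_1 = a + b - 2c \\
(1, -1, 0, 0)^\top,\ (0, 0, 1, -1)^\top &: \quad \lambda_2 = a - b
\end{aligned}
$$

Because $f$ is positive and strictly decreasing, $b > c > 0$, so $\lambda_0 > \lambda_1 > \lambda_2$: the constant mode comes first, then the root-level split, then the two within-branch splits.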

Orthogonality and direct-sum structure: For feature vectors in LLMs, parent-child relations manifest as orthogonal differences, and hierarchical categorical concepts are realized as simplices in orthogonal subspaces. This guarantees that manipulating one semantic branch does not affect others, matching strict taxonomic modularity (Park et al., 2024).
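
The modularity claim can be illustrated with synthetic vectors. The sketch below assumes a plain Euclidean inner product and hand-built concept vectors in which each child equals its parent plus an orthogonal increment. It demonstrates the geometric consequence only and is not a measurement on any model.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random orthonormal basis, so the toy vectors are not axis-aligned.
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)))
e = Q.T                          # rows are orthonormal directions

animal = 2.0 * e[0]
mammal = animal + 1.5 * e[1]     # child = parent + orthogonal increment
dog = mammal + 0.7 * e[2]

step_1 = mammal - animal         # "animal -> mammal" direction
step_2 = dog - mammal            # "mammal -> dog" direction

x = rng.normal(size=6)           # stand-in for a hidden representation
x_edited = x + 3.0 * step_2      # intervene along the finer-grained direction

print("successive steps orthogonal:", np.isclose(step_1 @ step_2, 0.0))
print("parent readout unchanged:   ", np.isclose(x_edited @ mammal, x @ mammal))
print("ancestor readout unchanged: ", np.isclose(x_edited @ animal, x @ animal))
```

Moving along the "mammal to dog" direction leaves the projections onto "mammal" and "animal" untouched, which is the sense in which edits to a lower level do not disturb its ancestors.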

Metric cones: Cone heights are identifiable for generic configurations: uniqueness is guaranteed in one parameter regime, and in a second regime when base points are well separated. The cone construction makes curvature more negative, which suits tree-like structures (Takehara et al., 2021).

Causal structure: Spacetime embeddings require only local parent-child signals, and the algorithm is guaranteed to converge to a perfect embedding (mean rank = 1, MAP = 1) for large taxonomies, including all of WordNet (Anabalon et al., 7 May 2025).

4. Empirical Results and Case Studies

Language-model embeddings: Empirical eigenspace alignment between theoretical and learned embeddings (word2vec, Gemma) on sampled WordNet subtrees is consistently high, and principal components visually reflect coarse-to-fine hierarchical contrasts (e.g., "animal" vs. "plant," then "mammal" vs. "bird") (Nava et al., 22 May 2026).
LLMs and concept manifolds: Over 90% of evaluated WordNet features are cleanly separated in unembedding-projection space in Gemma and LLaMA-3. Simplex geometry of categorical concepts and the orthogonal decomposition of subspaces are directly observed (Park et al., 2024).
Retrofit and plug-in hierarchy extraction: Cone embedding retrofitted to arbitrary pre-trained embeddings (e.g., GCN, LINE, Poincaré) can efficiently learn principled, interpretable height variables $s_u$, achieving better tree-reconstruction metrics than previous hyperbolic approaches (Takehara et al., 2021).
Spacetime representations: In 3D Minkowski embeddings, hierarchies of more than 80,000 tokens are reconstructed perfectly, outperforming high-dimensional hyperbolic baselines on mean rank and MAP (Anabalon et al., 7 May 2025).
Topic models and TDA: Hyperbolic geometry-based topic models (HyHTM) yield more coherent and granular topic trees than Euclidean NMF or LDA variants, while geometric/topological pipelines using persistent homology extract stable, interpretable concept clusters in object compositional data (Shahid et al., 2023, Mueller et al., 2018, Aloni et al., 2021).

5. Practical Implementations and Algorithms

Spectral/embedding pipelines: Construct normalized co-occurrence matrices from corpus statistics, fit the kernel $f$ (typically exponential), form the Gram matrix, diagonalize it, and project onto the leading eigenvectors to obtain the coarse-to-fine geometry (Nava et al., 22 May 2026).
Cone embedding: Either learn the base coordinates $z_u$ and the heights $s_u$ jointly via Riemannian SGD, or fit the cone-specific variables on top of fixed base coordinates $z_u$. This plug-in mode allows fast augmentation of any prior embedding with hierarchical structure. The update for the cone-specific variables is closed form; a negative-sampling loss is used for scalability (Takehara et al., 2021).
Polar/hyperspherical approaches: Project latent vectors to tangents at the pole, map to the sphere, enforce containment via triplet geodesic losses and variational mean-field regularization. Structure-guided retrieval first prunes on radius, then ranks by angular similarity (Mishra et al., 30 Apr 2026).
Spacetime geometry: Fast local update algorithm adjusts time coordinates to achieve causal constraints. Querying for parental or descendant nodes is reduced to searching within the past/future lightcone, optionally using minimal proper time (Anabalon et al., 7 May 2025).
Topological data analysis: Persistent homology is applied to stimulus or sample vectors from segmented objects or datasets, yielding birth-death diagrams. Stable clusters at meaningful filtration scales correspond to robust "shape concepts" (Mueller et al., 2018, Aloni et al., 2021).
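
For the topological step, the simplest birth-death diagram is the zero-dimensional one, which tracks connected components. The sketch below computes it with a union-find pass over sorted pairwise distances. It assumes a Vietoris-Rips filtration in which an edge appears at a scale equal to its length (some conventions use half that), and it is not the pipeline of the cited works. The function name `h0_persistence` is hypothetical.

```python
import numpy as np

def h0_persistence(points):
    """Death scales of connected components (all born at scale 0).

    Returns n - 1 finite death values; one component never dies.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    deaths = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                        # this edge merges two components
            parent[ri] = rj
            deaths.append(float(w))
    return deaths

# Two well-separated pairs of points.
deaths = h0_persistence([(0, 0), (0, 1), (10, 0), (10, 1)])
clusters_at_scale_5 = 1 + sum(death > 5 for death in deaths)
print(deaths, clusters_at_scale_5)
```

A large gap between consecutive death values marks a filtration scale at which the clustering is stable: here two clusters persist across every scale between 1 and 10.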

6. Applications, Interpretability, and Future Directions

Hierarchical concept geometry underpins tasks in semantic search, ontology induction, taxonomy expansion, and robust natural language understanding. Explicit geometric separation of concept branches enables safe "concept surgery" (editing model beliefs in a modular way), improved interpretability (by extracting orthogonal subspaces or simplex polytopes), and sample-efficient learning (by imposing geometric inductive biases).

Sparse coding and concept extraction: Rather than searching for overlapping or collinear features, the geometry prescribes searching for orthogonal simplices and their normals, suggesting new classes of taxonomy-aware sparse coding architectures (Park et al., 2024).
Efficient search and retrieval: Radius/height coordinates (cone, polar) or causal time can make hierarchical search sublinear, since candidate sets are pruned on the hierarchy coordinate before finer-grained similarity scoring.
Robustness and transfer: Conformal spacetime and cone constructions admit generalization across very different data domains, as seen in cross-dataset shape-concept transfer (Anabalon et al., 7 May 2025, Mueller et al., 2018).
Model design and the geometry of generalization: The conformity between LLM concept geometry and theoretical predictions, even without explicit hierarchical signal, supports the claim that the spectral structure of pairwise co-occurrence suffices for emergent hierarchy. Potential future directions include architectures that bias toward direct-sum/simplicial-orthogonality, hybrid topological-geometric concept representations, and curvature-adaptive metric learning for arbitrary DAGs and multimodal taxonomies.
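
The prune-then-rank idea behind efficient retrieval can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `RadialIndex`, the tolerance band, and the toy data are all hypothetical, and the code is not the Polaris retrieval procedure. Radii are sorted once so that the depth band is located by binary search, and cosine similarity is computed only for items inside the band.

```python
import numpy as np

class RadialIndex:
    """Hypothetical index: prune by radius (depth), then rank by angle."""

    def __init__(self, radii, directions):
        radii = np.asarray(radii, dtype=float)
        self.order = np.argsort(radii)            # item ids sorted by radius
        self.sorted_radii = radii[self.order]
        d = np.asarray(directions, dtype=float)
        self.unit = d / np.linalg.norm(d, axis=1, keepdims=True)

    def search(self, query_dir, target_radius, tol=0.25, top_k=3):
        # Binary search for the radial band [target - tol, target + tol].
        lo = np.searchsorted(self.sorted_radii, target_radius - tol, side="left")
        hi = np.searchsorted(self.sorted_radii, target_radius + tol, side="right")
        candidates = self.order[lo:hi]
        q = np.asarray(query_dir, dtype=float)
        q = q / np.linalg.norm(q)
        scores = self.unit[candidates] @ q        # cosine similarity
        best = np.argsort(-scores)[:top_k]
        return [(int(candidates[k]), float(scores[k])) for k in best]

# Toy data: items 0-1 at depth radius 1, items 2-4 at depth radius 2.
index = RadialIndex(
    radii=[1.0, 1.0, 2.0, 2.0, 2.0],
    directions=[(1, 0), (0, 1), (1, 0.1), (0, 1), (-1, 0)],
)
hits = index.search(query_dir=(1, 0), target_radius=2.0, top_k=2)
print(hits)
```

Item 0 points in exactly the query direction but is never scored, because it sits at the wrong depth. The band lookup costs logarithmic time. The scoring cost depends on how many items share the target depth.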

7. Summary Table: Geometric Frameworks and Their Properties

Framework	Hierarchy Variable	Base Space/Metric	Unique Features	Representative Reference
Spectral tree kernel	Eigenvector index	Empirical Gram	Coarse-to-fine spectral splitting, no tuning needed	(Nava et al., 22 May 2026)
Linear direction/simplices	Linear spans	LLM representation space	Orthogonality, direct sum decomposition	(Park et al., 2024)
Cone embedding	Height $s$	$C^\beta(Z)$	Explicit apex/root, extends any embedding	(Takehara et al., 2021)
Polar/Polaris	Radius	Hypersphere	Separates depth and content, robust to noise	(Mishra et al., 30 Apr 2026)
Hyperbolic/HyHTM	Distance to origin	Poincaré ball	Exponential expansion, supports topic trees	(Shahid et al., 2023)
Spacetime embedding	Time coordinate	3D Minkowski	Causal/lightcone order, perfect reconstruction	(Anabalon et al., 7 May 2025)
TDA (Persistent homology)	Filtration scale	Simplicial complex	Topological clusters, no parameter tuning	(Aloni et al., 2021, Mueller et al., 2018)

In conclusion, hierarchical concept geometry brings together spectral theory, Riemannian geometry, convex analysis, and topological data methods to provide principled, rigorous models for the representation, discovery, and manipulation of concept hierarchies. Geometric encodings now underpin both the interpretability of LLMs and practical advances in search, ontology construction, and semantic modeling. Ongoing research continues to refine the links among symbolic, metric, and topological perspectives on conceptual structure.