Topological Positional Encodings
- Topological positional encodings are coordinate systems derived from intrinsic non-Euclidean structures, capturing connectivity, cycles, and hierarchy in data.
- They utilize spectral, random-walk, and diffusion-based techniques to embed position awareness into neural architectures like GNNs and transformers.
- Recent advances extend these encodings to higher-order, quantum, and Lie-algebraic methods, significantly boosting learning, generalization, and expressivity.
Topological positional encodings are geometric or combinatorial coordinate systems embedded into neural architectures, most prominently graph neural networks (GNNs) and transformers for structured data. They endow each basic element (node, edge, high-order cell, patch, or feature) with a position- or structure-aware vector. Unlike sequential or grid-based encodings, topological encodings are derived from the intrinsic structure of non-Euclidean data (graphs, complexes, manifolds, or estimated latent graphs), enabling models to capture connectivity, cycles, hierarchy, and other structural features while preserving permutation equivariance. Their design draws upon spectral graph theory, random walks, homology, and, in advanced cases, quantum dynamics or Lie group symmetries. Recent research indicates that these encodings are crucial for restoring or preserving topological information in architectures that are fundamentally indifferent to data geometry, such as transformers and high-order GNNs, and that incorporating them improves learning, generalization, and expressivity.
1. Fundamental Classes and Theoretical Equivalence
The topological positional encoding paradigm bifurcates into absolute positional encodings (APEs) and relative positional encodings (RPEs).
- APEs assign a vector to each element (e.g., node, patch, feature) based on the global structure. Examples: Laplacian eigenvectors, persistence-informed embeddings, coordinates from spectral or random-walk statistics.
- RPEs assign a feature to each pair of elements, encoding their mutual relation (e.g., shortest-path distance, resistance distance, random-walk probability).
Up to distinguishing power, APEs and RPEs have been shown to be interchangeable for graph transformers. Specifically, for any APE, an RPE synthesized from it by a sufficiently expressive DeepSet can match its distinguishing power. Conversely, any diagonally aware RPE can be converted into an APE by a 2-order equivariant network. This equivalence allows theoretical and practical insights to transfer between the two classes, although their computational and memory costs may differ significantly (Black et al., 2024).
Table: Representative Topological Positional Encodings
| Encoding | Type | Key Properties |
|---|---|---|
| Laplacian Eigenvectors | APE | Global, spectral, sign-ambiguous |
| Shortest-Path Distance | RPE | Combinatorial, pairwise |
| Resistance Distance | RPE | Global current-flow sensitivity |
| Persistent Homology | APE | Multiscale cycles/components |
| Random-Walk Return | Both | Local-to-global, time-scale tunable |
| Quantum Correlations | APE | Encodes entanglement/interference patterns |
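As an illustration of two of the pairwise (RPE) entries in the table, the sketch below computes shortest-path and resistance distances for a small graph with NumPy. The function names and the dense-matrix approach are assumptions made for this example, and the resistance formula via the Laplacian pseudoinverse assumes a connected graph.

```python
import numpy as np

def shortest_path_distances(adj):
    """All-pairs hop distances; unreachable pairs stay at infinity."""
    n = adj.shape[0]
    a = (adj > 0).astype(int)
    dist = np.full((n, n), np.inf)
    np.fill_diagonal(dist, 0.0)
    reached = np.eye(n, dtype=bool)
    frontier = np.eye(n, dtype=int)
    for hop in range(1, n):
        # Nodes adjacent to the current frontier that were not reached earlier.
        new = (frontier @ a > 0) & ~reached
        if not new.any():
            break
        dist[new] = hop
        reached |= new
        frontier = new.astype(int)
    return dist

def resistance_distances(adj):
    """Effective resistance R_ij = L+_ii + L+_jj - 2 L+_ij (connected graph assumed)."""
    lap = np.diag(adj.sum(axis=1)) - adj
    lap_pinv = np.linalg.pinv(lap)
    diag = np.diag(lap_pinv)
    return diag[:, None] + diag[None, :] - 2.0 * lap_pinv

# Example graph: the 4-cycle 0-1-2-3-0.
cycle4 = np.array([[0, 1, 0, 1],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [1, 0, 1, 0]], dtype=float)
spd = shortest_path_distances(cycle4)
res = resistance_distances(cycle4)
```

Both results are N x N matrices, one feature per node pair. In a graph transformer such matrices would typically be fed in as attention biases or edge features, while an APE such as Laplacian eigenvectors is an N x k matrix attached to the node features.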
2. Spectral, Random-Walk, and Diffusion-Based Constructions
The canonical topological positional encodings are constructed from the (normalized) Laplacian, adjacency, or random-walk matrices.
- Laplacian Eigenvector Encodings (LapPE/ESLapPE): Assign to each node or feature the first k Laplacian eigenvector coordinates, with optional sign-invariant pre-processing (e.g., SignNet), and sometimes random sign flipping or MLP-projection to smooth over ambiguities (Grötschla et al., 2024, 2502.01122). These eigenvectors capture global and local graph harmonics.
- Random-Walk and Diffusion Encodings (RWSE, RRWP, PPR): Utilize polynomials of the random-walk matrix R = D⁻¹A to encode for each element its self-return probability or transition profile over K steps, or compute personalized PageRank scores. Full-matrix off-diagonal features (RRWP) form highly expressive edge-wise RPEs (but with heavy memory) (Grötschla et al., 2024).
- Stable and Expressive Spectral Encodings (SPE): Generalize by combining spectral kernels f(λ) applied to eigenvalues, then (optionally) pass the kernel through a 2-order equivariant network (Black et al., 2024).
These approaches are computationally intensive for large graphs due to O(N³) decomposition costs, prompting research into approximations and learnable nonlinear mappings (cf. PEARL), as well as hybrid approaches that mix local neighborhood statistics and global spectral modes (2502.01122).
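A minimal sketch of the two most common constructions, Laplacian eigenvector coordinates and random-walk return probabilities, is given here for small dense graphs. The function names, the choice of the symmetric normalized Laplacian, and the skipping of the first eigenvector (which assumes a connected graph) are illustrative assumptions, not the reference implementations of LapPE or RWSE.

```python
import numpy as np

def laplacian_pe(adj, k, rng=None):
    """First k non-trivial eigenvectors of the symmetric normalized Laplacian."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap = np.eye(len(deg)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)      # eigenvalues come back in ascending order
    pe = eigvecs[:, 1:k + 1]              # skip the trivial eigenvector (connected graph assumed)
    if rng is not None:
        # Random sign flipping: each eigenvector is only defined up to sign.
        pe = pe * rng.choice([-1.0, 1.0], size=pe.shape[1])
    return pe

def random_walk_se(adj, num_steps):
    """Self-return probabilities diag(R^t) for t = 1..num_steps, with R = D^-1 A."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    rw = adj / np.maximum(deg, 1.0)[:, None]
    power = np.eye(adj.shape[0])
    feats = []
    for _ in range(num_steps):
        power = power @ rw
        feats.append(np.diag(power))
    return np.stack(feats, axis=1)        # shape (N, num_steps)

# Example graphs: a path on 5 nodes and the 4-cycle.
path5 = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
cycle4 = np.array([[0, 1, 0, 1],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [1, 0, 1, 0]], dtype=float)
lap_pe = laplacian_pe(path5, k=2, rng=np.random.default_rng(0))
rw_pe = random_walk_se(cycle4, num_steps=4)
```

The dense eigendecomposition in `laplacian_pe` is where the cubic cost arises. The random-walk features need only repeated matrix products, and on sparse graphs those can be computed with sparse operations.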
3. Extensions to Higher-Order, Geometric, and Persistent Topologies
Topological positional encodings have been extended to more general data domains:
- Higher-Order Complexes (HOPSE): Generalize PEs to arbitrary combinatorial complexes (simplicial, cellular, or otherwise) by decomposing the structure into rank-wise Hasse graphs and applying any standard graph PE extraction channel. The per-rank encodings are aggregated via permutation-equivariant reductions (concat/mean/MLP), yielding cell-rank-aware encodings that are essential for multi-way and high-dimensional interactions and that scale linearly (Carrasco et al., 21 May 2025).
- Persistent Homology-Informed PE (PiPE): Integrates persistent homology, tracking the birth and death of topological features across filtrations, with standard positional encodings. PiPE alternates learnable PEs with PH-derived coordinates at each layer, yielding a system provably more expressive than either class alone (Verma et al., 6 Jun 2025).
- Geometric Grid and Manifold PE (GeoPE, Geotokens): For grid or manifold-structured data, spatial and manifold topology are encoded directly. GeoPE employs Lie-algebra averaged quaternionic 3D rotations for 2D or 3D grids (effectively restoring spatial adjacency lost by flattening in transformers), while Geotoken embeddings preserve spherical geometry for geo-coordinates via block-diagonal 3D Euler rotations (Yao et al., 4 Dec 2025, Unlu, 2024). These approaches maintain isometry and properly reflect true spatial/metric topology, not sequence artifacts.
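To make "birth and death of topological features across filtrations" concrete, the sketch below computes 0-dimensional persistence pairs (connected components) for a graph under a sublevel-set filtration of a node function, using union-find and the elder rule. This is a generic textbook construction, not the PiPE pipeline. The function name and the choice of filtration are assumptions, and the step that turns the resulting diagram into per-node coordinates is left out.

```python
import math

def zero_dim_persistence(values, edges):
    """(birth, death) pairs of connected components in a sublevel-set filtration.

    values[i] is the filtration value of node i; an edge enters the
    filtration once both of its endpoints are present.
    """
    parent = list(range(len(values)))
    birth = list(values)  # birth[r] is valid for component roots r

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    weighted = sorted((max(values[u], values[v]), u, v) for u, v in edges)
    for w, u, v in weighted:
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # the edge closes a cycle; no component dies
        # Elder rule: the component born later dies at this edge.
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru
        pairs.append((birth[rv], w))
        parent[rv] = ru
    # Components that never merge persist forever.
    roots = {find(i) for i in range(len(values))}
    pairs.extend((birth[r], math.inf) for r in roots)
    return sorted(pairs)

# Path graph 0-1-2 with node values 0, 2, 1.
diagram = zero_dim_persistence([0.0, 2.0, 1.0], [(0, 1), (1, 2)])
```

In this example the component born at node 2 (value 1) merges into the older one at filtration value 2, and the component born at value 0 never dies. Pairs with equal birth and death have zero persistence and are usually discarded.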
4. Structural Estimation for Latent-Topology Tasks
In settings lacking explicit topological structure (e.g., tabular, multimodal, or time-series data), latent structural graphs between features are estimated:
- Feature-Graph-Based PEs (Tab-PET): Computes pairwise associations or causality among features to construct a latent feature graph. Low-frequency Laplacian eigenvectors of this feature graph serve as positional encodings for each input dimension. This imparts positional bias, reduces effective rank through score gaps in attention, and simplifies learning in tabular transformers (Leng et al., 17 Nov 2025).
- Best Practices: Association-based graphs (e.g., Spearman, mutual information) yield more robust PEs than causality-based graphs in low-data regimes. Concatenating fixed (non-learned) PEs to feature embeddings typically outperforms methods that attempt to learn PEs end-to-end when structural cues are weak.
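The feature-graph recipe can be sketched as follows: estimate pairwise Spearman associations between columns, threshold them into a weighted feature graph, and take low-frequency Laplacian eigenvectors as fixed per-feature encodings. This is an illustrative approximation rather than the Tab-PET implementation. The function names, the `threshold` value, the use of the unnormalized Laplacian, and the tie handling in the rank transform are all assumptions, and constant columns are assumed absent.

```python
import numpy as np

def spearman_matrix(x):
    """Spearman correlation between columns (ties broken by order, an approximation)."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def feature_graph_pe(x, k, threshold=0.3):
    """Low-frequency Laplacian eigenvectors of an association-based feature graph."""
    assoc = np.abs(spearman_matrix(x))
    np.fill_diagonal(assoc, 0.0)
    adj = np.where(assoc >= threshold, assoc, 0.0)  # hypothetical sparsification rule
    lap = np.diag(adj.sum(axis=1)) - adj
    _, eigvecs = np.linalg.eigh(lap)                # ascending eigenvalues
    return eigvecs[:, :k]                           # one k-dim vector per feature

# Synthetic table: 200 rows, 5 features, feature 1 strongly tied to feature 0.
rng = np.random.default_rng(0)
table = rng.normal(size=(200, 5))
table[:, 1] = table[:, 0] + 0.1 * rng.normal(size=200)

pe = feature_graph_pe(table, k=3)
feature_embeddings = rng.normal(size=(5, 8))        # stand-in for learned embeddings
tokens = np.concatenate([feature_embeddings, pe], axis=1)
```

The encodings are computed once from the training table and concatenated, without further learning, to each feature's embedding, in line with the fixed-PE practice described above.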
5. Disentanglement, Injectivity, and Expressivity Hierarchies
Several works analyze the interplay between positional and semantic signals, the impact of disentangling, and the strict expressivity properties of topological encodings.
- Disentangled Representations: Explicit isolation of absolute positional, relative positional, and semantic streams in transformers (e.g., DSTG) leads to the condensation of AP into a low-frequency 2D sinusoidal manifold, faithfully encoding coarse document or sequence structure, and improving performance across linguistic tasks relative to fully entangled or standard PEs (RoPE, bucketed RP) (Lequeu et al., 28 May 2026).
- Injectivity and Lossless Encoding: Topological PEs based on shortest-path, adjacency-powers, or spectral coordinates are provably injective under suitable conditions, enabling model-agnostic rewiring of graphs (r-hop expansion or virtual node insertion) while preserving recoverability and original connectivity. This can be used to control receptive field and mitigate over-squashing without architectural change (Brüel-Gabrielsson et al., 2022).
- Expressivity Comparisons:
  - Combinatorially aware RPEs (SPD) are strictly more expressive than 1-WL but do not fully capture global structure.
  - Resistance distance and full spectral or persistence-based encodings have strictly greater expressive power, e.g., distinguishing graphs that SPD or LapPE cannot (Black et al., 2024, Verma et al., 6 Jun 2025).
  - Combined PH+PE (as in PiPE) strictly outperforms standalone PE or PH, achieving separations unattainable by either method individually (Verma et al., 6 Jun 2025).
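A standard small example of the gap between 1-WL and shortest-path information is the pair "one hexagon" versus "two disjoint triangles": both graphs are 2-regular on six nodes, so 1-WL colour refinement cannot separate them, while their shortest-path-distance histograms differ. The sketch below checks this directly. The function names are chosen for this example, and it demonstrates a single separating pair rather than proving the general result.

```python
from collections import Counter, deque

def wl_histogram(adj_list, rounds=3):
    """1-WL colour refinement; colours are nested tuples so they compare across graphs."""
    colors = [0] * len(adj_list)
    for _ in range(rounds):
        colors = [
            (colors[v], tuple(sorted(colors[u] for u in adj_list[v])))
            for v in range(len(adj_list))
        ]
    return Counter(colors)

def spd_histogram(adj_list):
    """Multiset of shortest-path distances over ordered node pairs (inf if disconnected)."""
    n = len(adj_list)
    hist = Counter()
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            v = queue.popleft()
            for u in adj_list[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        for dst in range(n):
            if dst != src:
                hist[dist.get(dst, float("inf"))] += 1
    return hist

hexagon = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
two_triangles = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]

same_under_wl = wl_histogram(hexagon) == wl_histogram(two_triangles)
same_under_spd = spd_histogram(hexagon) == spd_histogram(two_triangles)
```

Here `same_under_wl` is true and `same_under_spd` is false: the hexagon contains pairs at distances 2 and 3, whereas the two triangles contain only distance-1 pairs and disconnected pairs.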
6. Empirical Impact and Task-Dependent Best Practices
Systematic benchmarks reveal the practical impact of topological positional encodings:
- Transformers for Graphs and Structured Data: Expressive topological PEs (Laplacian-based, random-walk-based, persistence-informed) are essential to match or exceed state-of-the-art performance in molecular, social, and synthetically challenging graph datasets, as well as in peptide and image-superpixel tasks (Grötschla et al., 2024, Verma et al., 6 Jun 2025).
- Receptive Field and Scalability: Expansion to r-hop neighborhoods with lossless positional annotation recovers the expressivity of topological deep learning (TDL) models at manageable computational cost, and dense PEs can be substituted for learnable PEs in low-data or large-input settings.
- Task-PE Alignment: Laplacian PEs excel at capturing smooth global structure; random-walk statistics are preferred for long-range dependencies or irregular graphs; and persistent-homology or hybrid methods become crucial for datasets distinguished by cycles or high-order topology.
- Failure Modes: Axis-independent grid PEs (e.g., vanilla 2D RoPE) fail to couple axes, leading to non-topological “shortcuts” in attention. Coupled approaches (GeoPE, geotokens) rectify this, restoring locality and isometry (Yao et al., 4 Dec 2025, Unlu, 2024).
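The r-hop expansion with lossless positional annotation mentioned in the list above can be sketched as a rewiring step that adds an edge between every pair of nodes within r hops and labels each edge with its hop distance. Because the original edges are exactly those labelled 1, the original connectivity remains recoverable. The function name and the dense-matrix representation are assumptions made for this example.

```python
import numpy as np

def r_hop_expand(adj, r):
    """Connect all node pairs within r hops; return (expanded_adj, hop_labels)."""
    n = adj.shape[0]
    a = (adj > 0).astype(int)
    hop = np.zeros((n, n), dtype=int)      # 0 means "no edge" (or the diagonal)
    reached = np.eye(n, dtype=bool)
    walk = np.eye(n, dtype=int)
    for h in range(1, r + 1):
        walk = (walk @ a > 0).astype(int)  # pairs joined by some walk of length h
        new = (walk > 0) & ~reached        # first reached at h, so hop distance is h
        hop[new] = h
        reached |= new
    expanded = (hop > 0).astype(int)
    return expanded, hop

# Path graph on 5 nodes, expanded to 2-hop neighbourhoods.
path5 = (np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)).astype(int)
expanded, hop = r_hop_expand(path5, r=2)
recovered = (hop == 1).astype(int)         # original graph read back from the labels
```

A message-passing model run on `expanded` sees a larger receptive field per layer, while the hop labels, used as edge features, keep the rewired graph distinguishable from a graph that natively had those edges.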
7. Advanced Constructions: Quantum and Lie-Algebraic PEs
Recent developments expand the definition of topological PE:
- Quantum Positional Encodings (QPEs): QPEs map a graph to the Hamiltonian of an N-qubit quantum system, leveraging both classical and quantum computable correlation patterns—e.g., entanglement in ground or dynamically evolved states (Ising/XY models)—to extract node-level features. Quantum-specific correlations, potentially inaccessible to classical PEs, can confer strictly higher expressive power for certain graph families and demonstrate empirical performance gains, albeit currently relying on classical simulation for tractability (Thabet et al., 2024).
- Lie-Algebra-Averaged Rotational PE (GeoPE): By embedding grid coordinates as commutative averages of axis rotations in the tangent space of SO(3), GeoPE overcomes the non-commutativity and asymmetry of naive sequential rotations, symmetrically coupling spatial axes and ensuring the Euclidean manifold structure is preserved. This enables transformers to correctly mirror geometric proximity and shape bias, which is essential for vision tasks, and it generalizes naturally to higher-dimensional data (Yao et al., 4 Dec 2025).
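The contrast between sequential axis rotations and a tangent-space average can be illustrated with a few lines of NumPy. The sketch uses Rodrigues' formula for the SO(3) exponential map. The generators, the equal-weight average, and the function names are assumptions made for illustration and are not taken from the GeoPE formulation itself.

```python
import numpy as np

# Generators of rotations about the x and y axes in the Lie algebra so(3).
GX = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
GY = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])

def so3_exp(omega):
    """Rodrigues' formula: matrix exponential of a 3x3 skew-symmetric matrix."""
    theta = np.sqrt(0.5 * np.sum(omega * omega))  # rotation angle
    if theta < 1e-12:
        return np.eye(3) + omega                  # first-order approximation near zero
    return (np.eye(3)
            + (np.sin(theta) / theta) * omega
            + ((1.0 - np.cos(theta)) / theta ** 2) * (omega @ omega))

def coupled_rotation(angle_x, angle_y):
    """Average the two axis generators in the tangent space, then exponentiate once."""
    return so3_exp(0.5 * (angle_x * GX + angle_y * GY))

ax, ay = 0.3, 0.7                                 # hypothetical per-axis position angles
seq_xy = so3_exp(ax * GX) @ so3_exp(ay * GY)      # rotate about y, then x
seq_yx = so3_exp(ay * GY) @ so3_exp(ax * GX)      # rotate about x, then y
coupled = coupled_rotation(ax, ay)
```

The two sequential products differ because rotations about different axes do not commute, so a sequential scheme has to privilege one axis ordering. The averaged generator is a sum and therefore independent of ordering, and its exponential is still a proper rotation.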
Topological positional encodings thus span a spectrum from combinatorial graph statistics and spectral methods through persistent homology, to advanced geometric and quantum constructions. Their ability to restore, preserve, or leverage topology in otherwise agnostic models has reshaped the landscape of architectural design for structured data, and their ongoing development continues to drive improvements in expressivity, efficiency, and robustness across domains.