Topological Orthogonality Overview
- Topological orthogonality is defined by using topological properties to certify disjointness, incompatibility, or decoupling across various mathematical contexts.
- It applies in settings from vector spaces with continuous function families to subset closure relations, persistence diagram comparisons, and graph representation bounds.
- Methods include characterizing orthogonality via extremal functionals, cosine similarity zeroing in data analysis, and topological lower bounds in combinatorial settings.
Topological orthogonality denotes several constructions in which orthogonality is induced, constrained, or interpreted by topological data. In one line of work it is an orthogonality relation on a real vector space equipped only with a topology and a family of continuous scalar-valued functions; in another it is a relation on subsets derived from closure, proximity, uniformity, or coarse structure; elsewhere it appears as the vanishing of cosine similarity between persistence diagrams, as a topological mechanism for bounding orthogonal representations of graphs, and as a dynamical non-correlation condition such as Möbius orthogonality. Current arXiv usage therefore suggests a family of mathematically distinct notions linked by the common role of topology in certifying disjointness, incompatibility, or decoupling (Sain et al., 2019, Dydak, 2018, Nordin et al., 6 Apr 2025, Attias et al., 2021, Aaronson et al., 23 Apr 2026).
1. Topology-induced orthogonality in vector spaces
The paper "Orthogonality in a vector space with a topology and a generalization of Bhatia-Semrl Theorem" introduces an orthogonality relation on an arbitrary real vector space $X$ equipped with a topology $\tau$, without requiring that $\tau$ make $X$ a topological vector space (Sain et al., 2019). The construction uses three ingredients: the topology $\tau$, a family $\mathcal{F}$ of $\mathbb{R}$-valued $\tau$-continuous functions, and a $\sim$-admissible set $\mathcal{A} \subseteq X$, where $\sim$ is the projective equivalence relation on $X \setminus \{0\}$ defined by
$$x \sim y \iff x = \lambda y \ \text{for some } \lambda \in \mathbb{R} \setminus \{0\}.$$
A subset 4 is 5-admissible if it contains exactly one nonzero vector from each 6-equivalence class.
For 7, the relation
8
holds if there exists 9 such that 0 and 1 for all 2. For arbitrary nonzero 3, one declares
4
where 5 is the chosen representative of the line 6. Everything is orthogonal to 7, and 8 is orthogonal to everything. The triple 9 is called an orthogonality space.
A central result is that Birkhoff–James orthogonality is recovered as a special case. If 0 is a Banach space, 1 is the norm topology, 2, and 3 is a 4-admissible slice of the unit sphere, then
5
where 6 means 7 for all 8. The same recovery remains valid for the weak topology on a Banach space with 9, and for perfectly normal topologies one may take 0 to be all strictly-separating 1-continuous functions. At the opposite extreme, if 2 contains the zero function, or if 3 is discrete and 4 is arbitrary, the induced relation is the trivial full relation 5 for all 6.
The paper also characterizes right additivity. Under the hypotheses that 7 is a family of nonzero continuous linear functionals on 8 and no two members of 9 are positive or negative multiples of one another,
0
holds if and only if for each 1 there is at most one 2 with 3. Specializing again to Banach spaces yields the classical equivalence between right additivity of Birkhoff–James orthogonality and smoothness of the space.
In finite-dimensional operator theory, the same framework yields a topological generalization of the Bhatia–Šemrl theorem. For 4, with 5 finite-dimensional and 6 topologized by finitely many seminorms 7, the paper characterizes orthogonality 8 by the existence of 9 and 0 such that 1, 2, and 3. The proof proceeds through an analogue of James’s lemma for 4 and a compactness-and-separation argument on 5. This places classical norm-based operator orthogonality inside a broader topological extremal-functional formalism.
2. Orthogonality relations on subsets and morphisms
A different tradition, developed by Dydak, treats orthogonality as a primitive relation on subsets of a set 6, and uses it to unify small-scale and large-scale geometry (Dydak, 2018). The starting point is a symmetric map
7
that is “bi-linear” in the sense that 8 and 9. When 0 is basic, meaning that it takes only the values 1 and 2, one defines
3
Conversely, any symmetric relation on subsets satisfying the corresponding monotonicity axioms determines such a basic dot-product.
Within this framework, the classical topological instance is
$$A \perp C \iff \operatorname{cl}(A) \cap \operatorname{cl}(C) = \emptyset,$$
with dot-product
5
Here the Kuratowski closure operator 6 is viewed as an idempotent projection satisfying
7
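The closure-based instance can be illustrated on a finite topological space. The following is a minimal sketch, assuming that topological orthogonality of two subsets means disjointness of their closures; the four-point space, the helper names `closure` and `closure_orthogonal`, and the constants `POINTS` and `OPEN_SETS` are hypothetical and chosen only for illustration.

```python
# Hypothetical finite topological space on four points.
POINTS = frozenset({1, 2, 3, 4})
OPEN_SETS = [
    frozenset(),
    frozenset({1}),
    frozenset({1, 2}),
    frozenset({3, 4}),
    frozenset({1, 3, 4}),
    frozenset({1, 2, 3, 4}),
]


def closure(subset, open_sets=OPEN_SETS, points=POINTS):
    """Kuratowski closure: all points whose every open neighbourhood meets `subset`."""
    subset = frozenset(subset)
    return frozenset(
        p for p in points
        if all(u & subset for u in open_sets if p in u)
    )


def closure_orthogonal(a, c, open_sets=OPEN_SETS, points=POINTS):
    """Assumed reading of topological orthogonality: the closures are disjoint."""
    return not (closure(a, open_sets, points) & closure(c, open_sets, points))


# Disjoint sets need not be orthogonal: the closure of {1} is {1, 2}.
print(closure({1}))                   # frozenset({1, 2})
print(closure_orthogonal({1}, {2}))   # False
print(closure_orthogonal({1}, {3}))   # True
```

In this example `{1}` and `{2}` are disjoint but not orthogonal, because the point 2 lies in the closure of `{1}`; this is the sense in which the relation is finer than plain disjointness.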
Dydak also defines normal, or Tietze, orthogonality. If 8, normality requires the existence of 9 with
0
This enables a parallel–perpendicular decomposition analogous to linear algebra: 1 where
2
The significance of this viewpoint is that the same formalism captures topological orthogonality, proximity spaces, uniform spaces, and large-scale constructions such as metric coarse orthogonality, Higson-corona orthogonality, Gromov-hyperbolic orthogonality, and Freudenthal orthogonality. It also supports 3-large-scale compactifications that recover the Čech–Stone compactification, Samuel–Smirnov compactification, Freudenthal compactification, Higson corona, and Gromov boundary.
A categorical reformulation appears in "A naive diagram-chasing approach to formalisation of tame topology" (Gavrilovich et al., 2018). There orthogonality is Quillen-style lifting orthogonality of morphisms: for arrows 4 and 5,
6
means that every commutative square with 7 on the left and 8 on the right admits a diagonal filler. Iterated left and right orthogonals of simple generating maps recover standard properties. For example, surjections are 9, injections are 0, connected spaces are characterized by orthogonality to the collapse map 1, and similar constructions describe total disconnectedness, dense image, induced topology, 2, 3, Hausdorffness, and compactness. In that setting topological and uniform spaces are represented as simplicial objects in the category of filters. This suggests that, beyond subset disjointness, orthogonality can serve as an abstract logical operator encoding separation and extension principles.
3. Persistence diagrams and perfect topological dissimilarity
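In topological data analysis, "On the cosine similarity and orthogonality between persistence diagrams" introduces an orthogonality notion for persistence diagrams based on persistence landscapes (Nordin et al., 6 Apr 2025). If $D$ is a non-empty persistence diagram, its persistence-landscape transform is
$$\Lambda(D) = (\lambda_k^D)_{k \ge 1},$$
where each $\lambda_k^D$ is the $k$-th landscape layer. On the image of $\Lambda$, the paper defines
$$\langle \Lambda(D), \Lambda(E) \rangle = \sum_{k \ge 1} \int_{\mathbb{R}} \lambda_k^D(t)\, \lambda_k^E(t)\, dt,$$
$$\|\Lambda(D)\| = \sqrt{\langle \Lambda(D), \Lambda(D) \rangle},$$
and the cosine similarity
$$\cos(D, E) = \frac{\langle \Lambda(D), \Lambda(E) \rangle}{\|\Lambda(D)\|\, \|\Lambda(E)\|}.$$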
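By Cauchy–Schwarz, the cosine similarity, written $\cos(D, E)$ for diagrams $D$ and $E$, satisfies $|\cos(D, E)| \le 1$. Orthogonality is defined by
$$D \perp E \iff \cos(D, E) = 0.$$
The paper proves an equivalent interval-disjointness criterion: $$D \perp E \iff (b, d) \cap (b', d') = \emptyset \ \text{for all } (b, d) \in D \text{ and } (b', d') \in E.$$ Thus orthogonality means that every open birth–death interval from one diagram is disjoint from every open birth–death interval from the other. The relation is symmetric and invariant under re-ordering of diagram points.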
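The interval-disjointness criterion lends itself to a direct combinatorial check. The sketch below assumes a diagram is given as a list of finite `(birth, death)` pairs with `birth < death`; the function names are hypothetical and not taken from the paper.

```python
def open_intervals_disjoint(p, q):
    """True if the open intervals (b1, d1) and (b2, d2) do not intersect."""
    (b1, d1), (b2, d2) = p, q
    return d1 <= b2 or d2 <= b1


def diagrams_orthogonal(diagram_a, diagram_b):
    """Check pairwise disjointness of all open birth-death intervals."""
    return all(
        open_intervals_disjoint(p, q)
        for p in diagram_a
        for q in diagram_b
    )


# Intervals that only share an endpoint are still disjoint as open intervals.
D = [(0.0, 1.0), (2.0, 3.0)]
E = [(1.0, 2.0), (3.0, 5.0)]
print(diagrams_orthogonal(D, E))               # True
print(diagrams_orthogonal(D, [(0.5, 1.5)]))    # False
```

The check costs one comparison per pair of points and involves no integration, so it is unaffected by the quadrature error that a landscape-based computation can introduce.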
This orthogonality is stronger than separation by bottleneck or Wasserstein distances. If 05, then the trivial matching is a perfect matching for both the bottleneck distance 06 and the 07-Wasserstein distance 08, and one obtains
09
10
At the same time, the paper emphasizes that 11 and 12 can be arbitrarily small even if supports are disjoint, so orthogonality is not equivalent to large metric distance. A common misconception is therefore that orthogonal persistence diagrams must be metrically far apart; the cited examples show that this need not hold.
The paper also gives an explicit orthogonal family. For
13
14
all intervals in 15 lie below those in 16, so every interval pair is disjoint and 17.
For computation, the paper describes the following pipeline: build a Vietoris–Rips filtration from a finite point cloud and compute a persistence diagram 18; transform 19, truncating when 20; approximate the integrals by quadrature on the piecewise-linear graph; compute norms and inner products; and decide orthogonality when 21 is below a numerical threshold 22. In experiments on point-clouds sampled from a disk 23, an annulus 24, and a circle 25, the cosine distance 26 separated 27 vs. 28 with 29 and 30 vs. 31 with 32, whereas 33 and 34 could not reliably do so. The method inherits shortcomings of persistence landscapes, including sensitivity to outliers, and numerical integration may create small nonzero inner products for nearly orthogonal diagrams.
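The numerical part of this pipeline can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the persistence diagram is taken as given (the Vietoris–Rips step is omitted), landscape layers are sampled on a uniform grid, the inner product is assumed to be the layer-wise $L^2$ pairing, and integrals are approximated with the trapezoid rule. All function names are hypothetical.

```python
import numpy as np


def landscape(diagram, grid):
    """Sample landscape layers on `grid`; row k-1 holds the k-th layer."""
    # Each point (b, d) contributes the tent function max(0, min(t - b, d - t)).
    tents = np.array(
        [np.maximum(0.0, np.minimum(grid - b, d - grid)) for b, d in diagram]
    )
    # The k-th layer is the k-th largest tent value at each grid point.
    return -np.sort(-tents, axis=0)


def inner_product(layers_a, layers_b, grid):
    """Trapezoid-rule approximation of the summed layer-wise L2 pairing."""
    k = min(len(layers_a), len(layers_b))  # absent layers are identically zero
    integrand = (layers_a[:k] * layers_b[:k]).sum(axis=0)
    return float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(grid)) / 2.0)


def cosine_similarity(diagram_a, diagram_b, grid):
    """Cosine similarity between the sampled landscapes of two diagrams."""
    la, lb = landscape(diagram_a, grid), landscape(diagram_b, grid)
    norms = np.sqrt(inner_product(la, la, grid) * inner_product(lb, lb, grid))
    return inner_product(la, lb, grid) / norms


grid = np.linspace(0.0, 3.0, 3001)
print(cosine_similarity([(0.0, 1.0)], [(1.0, 2.0)], grid))  # disjoint intervals
print(cosine_similarity([(0.0, 2.0)], [(1.0, 3.0)], grid))  # overlapping intervals
```

In practice one would compare the returned value against a small numerical threshold rather than test for exact zero, since quadrature on nearly orthogonal diagrams can leave a small nonzero inner product.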
4. Graph orthogonal representations and topological lower bounds
In graph theory, orthogonality is attached to vector assignments on vertices, and topology enters through Borsuk–Ulam-type lower-bound arguments. Haviv defines a 35-dimensional orthogonal representation of a graph 36 over 37 as an assignment 38 such that distinct non-adjacent vertices receive orthogonal vectors, and the orthogonality dimension 39 is the minimum such 40 (Haviv, 2018). The paper proves general lower bounds on 41 using the Borsuk–Ulam theorem, especially for complements of generalized Kneser graphs.
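Haviv's convention can be made concrete with a small verifier. The sketch below works over the reals with a numerical tolerance and assumes, as is standard for this definition, that the assigned vectors are nonzero; the function name and the three-vertex example are hypothetical.

```python
from itertools import combinations

import numpy as np


def is_orthogonal_representation(vectors, edges, tol=1e-9):
    """Check that distinct non-adjacent vertices receive orthogonal vectors.

    `vectors` maps each vertex to a real vector; `edges` lists adjacent pairs.
    Nonzero vectors are required (an assumption of this sketch).
    """
    edge_set = {frozenset(e) for e in edges}
    if any(np.linalg.norm(v) <= tol for v in vectors.values()):
        return False
    return all(
        abs(np.dot(vectors[u], vectors[w])) <= tol
        for u, w in combinations(vectors, 2)
        if frozenset((u, w)) not in edge_set
    )


# Path a - b - c: only the non-adjacent pair (a, c) is constrained.
edges = [("a", "b"), ("b", "c")]
good = {"a": np.array([1.0, 0.0]), "b": np.array([1.0, 1.0]), "c": np.array([0.0, 1.0])}
bad = {"a": np.array([1.0, 0.0]), "b": np.array([1.0, 1.0]), "c": np.array([1.0, 0.0])}
print(is_orthogonal_representation(good, edges))  # True
print(is_orthogonal_representation(bad, edges))   # False
```

A verifier of this kind only certifies upper bounds on the orthogonality dimension; the lower bounds discussed here come from topological arguments and cannot be obtained by checking individual assignments.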
For a set-system 42, the complement 43 of the generalized Kneser graph satisfies
44
where 45 is the 46-colorability-defect. A geometric form of the bound uses configurations 47 such that every open hemisphere contains the points of some 48, yielding
49
For ordinary Kneser graphs 50, one recovers
51
matching Lovász’s lower bound for chromatic number. Similar statements are obtained for Schrijver graphs and Borsuk graphs.
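The paper "Local Orthogonality Dimension" shifts attention from ambient dimension to locality (Attias et al., 2021). There an orthogonal representation of a graph $G$ over a field $\mathbb{F}$ is an assignment of a vector $u_v \in \mathbb{F}^t$ to each vertex $v$, with $\langle u_v, u_v \rangle \neq 0$ for every vertex and $\langle u_v, u_{v'} \rangle = 0$ whenever $v$ and $v'$ are adjacent. This reflects a complement-based change of convention: here adjacent vertices, rather than non-adjacent ones, must receive orthogonal vectors. The locality of a representation is
$$\max_{v \in V(G)} \dim \operatorname{span}\{u_w : w \in N[v]\},$$
where $N[v]$ denotes the closed neighbourhood of $v$,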
and the local orthogonality dimension 59 is the minimum possible locality.
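Locality can be computed directly for a given real representation. The sketch below assumes the adjacent-vertices-orthogonal convention of this paper and assumes that locality is the largest dimension spanned by the vectors of a closed neighbourhood; ranks are computed numerically over the reals, and all names are hypothetical.

```python
import numpy as np


def is_adjacent_orthogonal_representation(vectors, adjacency, tol=1e-9):
    """Check <u_v, u_v> != 0 for all v and <u_v, u_w> = 0 for adjacent v, w."""
    for v, neighbours in adjacency.items():
        if abs(np.dot(vectors[v], vectors[v])) <= tol:
            return False
        if any(abs(np.dot(vectors[v], vectors[w])) > tol for w in neighbours):
            return False
    return True


def locality(vectors, adjacency):
    """Largest rank spanned by the vectors of a closed neighbourhood N[v]."""
    return max(
        int(np.linalg.matrix_rank(
            np.array([vectors[w] for w in (v, *adjacency[v])], dtype=float)
        ))
        for v in adjacency
    )


# Star with centre 0 and leaves 1, 2, 3: leaves are pairwise non-adjacent.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
e = np.eye(4)
spread = {0: e[0], 1: e[1], 2: e[2], 3: e[3]}    # leaves use distinct axes
compact = {0: e[0], 1: e[1], 2: e[1], 3: e[1]}   # leaves share one axis
print(locality(spread, star), locality(compact, star))
```

Both assignments are valid representations of the star, but reusing one vector for all leaves lowers the locality, which illustrates why the parameter is defined as a minimum over representations.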
Topological methods again yield lower bounds. If a topological method implies 60 for a graph 61 with at least one edge, then
62
over every field. The proof uses the stronger fact of Alishahi–Meunier that any independent representation of a topologically 63-chromatic graph contains a copy of 64 whose two sides are linearly independent. In some families this lower bound is tight, notably for Schrijver graphs. In others the local orthogonality dimension over 65 equals the chromatic number: for every complement of a line graph,
66
The parameter also has algorithmic significance. For every fixed 67 and any field 68, deciding whether 69 is 70-hard. In index coding one has
71
over 72, so local orthogonality dimension furnishes upper bounds on optimum linear index-coding length. This makes topological orthogonality relevant not only to extremal graph theory but also to information theory and quantum one-round communication complexity.
5. Dynamical orthogonality and Möbius non-correlation
In topological dynamics, orthogonality refers to the vanishing of correlations between an orbit and an arithmetic or bounded sequence. Karagulyan defines topological Möbius orthogonality for a system $(X, T)$, with $X$ a compact metric space and $T$ a homeomorphism, by the condition that for every $f \in C(X)$ and every $x \in X$,
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu(n)\, f(T^n x) = 0,$$
where $\mu$ is the classical Möbius function (Karagulyan, 2017). Sarnak’s conjecture predicts that this holds whenever the topological entropy vanishes.
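The Cesàro averages in this definition can be explored numerically. The sketch below sieves the Möbius function and evaluates the average along an orbit of a circle rotation $x \mapsto x + \alpha \bmod 1$, a standard zero-entropy system; the choice of system, observable, and the function names are assumptions made for illustration, and a finite computation can only suggest, not prove, that the limit vanishes.

```python
import math


def mobius_sieve(n_max):
    """Return a list mu with mu[n] equal to the Moebius function for 1 <= n <= n_max."""
    mu = [1] * (n_max + 1)
    mu[0] = 0
    is_prime = [True] * (n_max + 1)
    for p in range(2, n_max + 1):
        if is_prime[p]:
            for m in range(p, n_max + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor flips the sign
            for m in range(p * p, n_max + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu


def mobius_average(f, alpha, x0, n_terms):
    """(1/N) * sum_{n=1}^{N} mu(n) * f(T^n x0) for the rotation T(x) = x + alpha mod 1."""
    mu = mobius_sieve(n_terms)
    total = sum(mu[n] * f((x0 + n * alpha) % 1.0) for n in range(1, n_terms + 1))
    return total / n_terms


def observable(x):
    """A continuous function on the circle."""
    return math.cos(2.0 * math.pi * x)


for n_terms in (100, 1000, 10000):
    print(n_terms, mobius_average(observable, math.sqrt(2.0), 0.0, n_terms))
```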
The main theorem of that paper shows that Möbius orthogonality fails for subshifts of finite type with positive topological entropy. More precisely, if 80 is a subshift of finite type with 81, then there exist 82 and 83 such that
84
Via Katok’s horseshoe theorem, every $C^{1+\alpha}$ surface diffeomorphism with positive entropy also fails to be orthogonal to the Möbius function. The proof uses a specification-type loop-concatenation construction and arithmetic progressions with positive density of square-free integers.
The paper "Unveiling universality, encloseness, and orthogonality in dynamics" generalizes this perspective from the Möbius function to an arbitrary bounded sequence 86 with mean zero (Aaronson et al., 23 Apr 2026). It defines Cesàro orthogonality 87 by
88
and logarithmic orthogonality 89 by the analogous logarithmic average. A stronger notion is the strong 90-MOMO property: 91 for every 92, every sequence 93, and every increasing sequence 94 with 95. The paper states that strong-MOMO implies orthogonality, and that 96 is equivalent to all uniquely ergodic factors of 97 enjoying strong-MOMO.
A principal lifting theorem says that if 98 has the strong 99-MOMO property and 00 is any topological system such that for each ergodic 01 there exists an ergodic 02 with 03 isomorphic to 04, then 05. This motivates universal topological models for characteristic classes of measure-preserving systems. For the class 06 of automorphisms whose ergodic components have pure discrete spectrum, the paper constructs a universal model on
07
It also proves that if the union of all measure-theoretic eigenvalues of a zero-entropy system 08 is countable, then Sarnak’s conjecture holds along a subsequence of full logarithmic density. A common source of confusion is that orthogonality in this literature is not geometric disjointness but cancellation of orbit-sequence correlations; the relevant topology is the topology of the dynamical model.
6. Operator theory, topological phases, and machine-learning recontextualizations
Several recent works use topological orthogonality language in more specialized ways. In "Orthogonality of bilinear forms and application to matrices," Roy, Senapati, and Sain characterize Birkhoff–James orthogonality in the Banach space 09, where 10 is a compact topological space and 11 a real normed space (Roy et al., 2024). For 12, with
13
and cones
14
the characterization is
15
If 16 is connected, this reduces to a single-point condition: 17 Applied to real bilinear forms and matrices, this yields an elementary proof of the real Bhatia–Šemrl theorem: for real matrices $A, B$, $A \perp_B B$ iff there exists a unit vector $x$ such that $\|Ax\| = \|A\|$ and $\langle Ax, Bx \rangle = 0$. Here compactness of the topological domain is what guarantees norm attainment and hence a finite orthogonality test set.
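The real Bhatia–Šemrl criterion can be checked numerically for small matrices. The sketch below uses the operator (spectral) norm and assumes that the largest singular value of the first matrix is simple, so that the norm-attaining unit vector is unique up to sign; it then compares the criterion with a sampled test of the Birkhoff–James inequality $\|A + \lambda B\| \ge \|A\|$. The function names are hypothetical.

```python
import numpy as np


def bhatia_semrl_witness(a, b, tol=1e-9):
    """Return a unit x with ||a x|| = ||a|| and <a x, b x> = 0, or None.

    Assumes the top singular value of `a` is simple; other cases are not handled.
    """
    _, s, vt = np.linalg.svd(a)
    if len(s) > 1 and s[0] - s[1] <= tol * max(s[0], 1.0):
        raise ValueError("top singular value is not simple; sketch does not apply")
    x = vt[0]  # right singular vector for the largest singular value
    return x if abs(np.dot(a @ x, b @ x)) <= tol else None


def birkhoff_james_sampled(a, b, lambdas, tol=1e-9):
    """Test ||a + lam * b|| >= ||a|| in spectral norm on a finite sample of scalars."""
    norm_a = np.linalg.norm(a, 2)
    return all(np.linalg.norm(a + lam * b, 2) >= norm_a - tol for lam in lambdas)


A = np.diag([2.0, 1.0])
B_orth = np.array([[0.0, 0.0], [0.0, 1.0]])
B_not = np.eye(2)
lambdas = np.linspace(-3.0, 3.0, 121)
print(bhatia_semrl_witness(A, B_orth) is not None, birkhoff_james_sampled(A, B_orth, lambdas))
print(bhatia_semrl_witness(A, B_not) is not None, birkhoff_james_sampled(A, B_not, lambdas))
```

The sampled inequality is only a necessary check on finitely many scalars, whereas the witness vector, when it exists, certifies orthogonality for all scalars at once.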
In topological phases of matter, "Anderson orthogonality catastrophe in 23-D topological systems" studies the overlap
24
between many-body ground states and shows a universal topological response term in its finite-size scaling (Gu, 2019). At fixed points of 25-dimensional topological orders,
26
Here 27 is the Euler characteristic and 28 is the central charge of the boundary CFT. For Laughlin wave functions, the paper finds a stronger leading behavior,
29
with 30 and 31 on the disk, and a corresponding sphere formula without the 32 term. The leading 33 gives decay faster than exponential. In this context, topological orthogonality refers to universal topological structure in overlap scaling rather than to an explicitly defined bilinear relation.
A further recontextualization appears in "MUSE: Resolving Manifold Misalignment in Visual Tokenization via Topological Orthogonality" (Yang et al., 7 May 2026). There topological orthogonality is a design principle for decoupling structural and semantic objectives in Transformers. Let 34 be a structural loss, 35 a semantic loss, 36 the attention-topology parameters, and 37 the feature-value parameters. The orthogonality requirement is
38
or, in shared-parameter form,
39
The architecture separates a topology stream
40
from a semantic stream
41
with stop-gradient operators to prevent cross-contamination. In experiments, the paper reports 42, linear probing 43 versus 44 for the InternViT-300M teacher, and structural 45. Ablations show that removing topology loss destroys geometry, while removing semantic anchoring yields “semantic blindness” with zero-shot 46. Figure 1 reports a change in gradient cosine from 47 in naive shared training to 48 in MUSE. This suggests a modern computational usage in which “topological orthogonality” no longer refers to classical geometric orthogonality of vectors or sets, but to orthogonal routing of learning signals through topology-sensitive and semantics-sensitive parameter subspaces.
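The gradient-cosine diagnostic mentioned here can be illustrated on a toy problem. The sketch below is not the MUSE architecture: it uses two hypothetical quadratic losses on a shared parameter vector and models stop-gradient routing by masking each gradient to its own parameter block, which is an assumption made purely for illustration.

```python
import numpy as np


def gradient_cosine(g1, g2, eps=1e-12):
    """Cosine of the angle between two gradient vectors."""
    return float(np.dot(g1, g2) / (np.linalg.norm(g1) * np.linalg.norm(g2) + eps))


def toy_gradients(theta, split, decoupled):
    """Gradients of two toy losses 0.5 * ||theta - 1||^2 and 0.5 * ||theta + 1||^2.

    With `decoupled=True`, the first gradient is confined to theta[:split] and the
    second to theta[split:], mimicking stop-gradient routing (an assumption).
    """
    g_struct = theta - 1.0   # pulls every parameter toward +1
    g_sem = theta + 1.0      # pulls every parameter toward -1
    if decoupled:
        index = np.arange(theta.size)
        g_struct = np.where(index < split, g_struct, 0.0)
        g_sem = np.where(index >= split, g_sem, 0.0)
    return g_struct, g_sem


theta = np.zeros(4)
print(gradient_cosine(*toy_gradients(theta, split=2, decoupled=False)))  # conflicting
print(gradient_cosine(*toy_gradients(theta, split=2, decoupled=True)))   # orthogonal
```

In the shared case the two objectives pull the same parameters in opposite directions, so the cosine is strongly negative; once each loss only updates its own block, the gradients have disjoint supports and the cosine is zero.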
Across these settings, the unifying pattern is not a single invariant formula but a recurrent structural role: topology identifies when two entities should be treated as independent, non-overlapping, or non-interfering. In some cases this is literal closure disjointness or interval separation; in others it is non-correlation, locality obstruction, universal finite-size response, or architectural gradient decoupling. The phrase therefore functions less as a single doctrine than as a cross-disciplinary template for imposing orthogonality through topological structure.