Geometric Alignment Score (GAS)
- Geometric Alignment Score (GAS) is a metric that compares the topological features of real and generated data manifolds using Betti curves and relative living time.
- It leverages computational topology constructs like simplicial complexes and persistent homology to detect GAN failures such as mode collapse and local geometric distortions.
- GAS computes a mean relative living time (MRLT) via witness complexes, offering a robust, topology-based alternative to conventional metrics like Inception Score and FID.
The Geometric Alignment Score (GAS), also known as the Geometry Score, is a quantitative and qualitative metric for comparing generative models, specifically Generative Adversarial Networks (GANs), via the topological properties of the data manifolds they generate. GAS evaluates the degree to which the model-generated sample manifold matches the topological “shape” of the manifold underlying real data, providing a distinctive perspective compared to conventional GAN evaluation metrics centered on perceptual similarity or feature distributions. This framework is grounded in computational algebraic topology and is applicable to datasets of arbitrary nature, including non-image domains (Khrulkov et al., 2018).
1. Manifold Hypothesis and Topological Motivation
In generative modeling, the manifold hypothesis posits that real-world data, such as natural images, resides on a low-dimensional, nonlinear submanifold $\mathcal{M}_{\text{data}}$ within a high-dimensional ambient space. A GAN generator induces its own model manifold $\mathcal{M}_{\text{model}}$. Traditional metrics, including Inception Score and Fréchet Inception Distance, depend on neural network feature extractors and Gaussian approximations, which can make them insensitive to certain structural pathologies such as mode collapse or local geometric distortions.
Topological properties are robust to continuous deformations: connected components, loops, and higher-dimensional “holes” persist under such transformations. GAS therefore compares real and generated data by intrinsic shape characteristics rather than by coordinates or learned features. The objective is to detect topological defects introduced by model failures, complementing existing metrics by identifying qualitative discrepancies missed by network-based or distributional comparisons.
2. Mathematical Framework
GAS draws upon four constructs from computational topology: simplicial complexes, the Vietoris–Rips filtration, Betti numbers, and persistent homology.
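- Simplicial Complexes: Given a finite vertex set $V$, an abstract simplicial complex $K$ is a collection of subsets of $V$ that includes every nonempty subset of each simplex and every singleton $\{v\}$, $v \in V$.
- Vietoris–Rips Filtration: For a metric space $(X, d)$ and scale parameter $\varepsilon \ge 0$, the complex $\mathrm{VR}_\varepsilon(X)$ contains a simplex for every finite set of points whose pairwise distances are at most $\varepsilon$ (one common convention). These complexes are nested as $\varepsilon$ increases, yielding a filtration.
- Betti Numbers: The $k$-th Betti number $\beta_k$ quantifies the number of independent $k$-dimensional holes ($\beta_0$: connected components; $\beta_1$: loops).
- Persistent Homology: As $\varepsilon$ varies, holes “appear” (birth $b_i$) and “disappear” (death $d_i$), forming intervals $[b_i, d_i]$. The Betti curve $\beta_k(\varepsilon)$ counts active $k$-dimensional features at scale $\varepsilon$.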
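To make the Betti-number and filtration ideas concrete, the following minimal sketch computes $\beta_0$ (the number of connected components) of a Vietoris–Rips complex at several scales. It assumes the convention that an edge is present when the distance is at most $\varepsilon$, and it uses made-up toy points. The function name `betti0_rips` is hypothetical and not part of any library. Only $\beta_0$ is shown because it needs nothing beyond a union-find; computing $\beta_1$ requires a persistent homology library.

```python
import math
from itertools import combinations

def betti0_rips(points, eps):
    """Number of connected components of the Vietoris-Rips complex at scale eps.

    beta_0 depends only on vertices and edges, so a union-find over all
    edges of length <= eps is sufficient.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(len(points))})

# Toy data (illustrative only): two well-separated pairs of points.
pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
betti0_curve = [betti0_rips(pts, eps) for eps in (0.5, 1.0, 9.0, 12.0)]
print(betti0_curve)  # [4, 2, 1, 1]
```

As the scale grows, the four isolated points first merge into two clusters and then into one, which is the nesting behaviour of the filtration seen through $\beta_0$.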
3. Derivation and Construction of GAS
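The metric construction focuses on $\beta_1$ (loops). The sequence of steps is as follows:
- Betti Curves and Relative Living Time (RLT): The instantaneous Betti-1 count at scale $\varepsilon$ is $\beta_1(\varepsilon)$. For integer $j \ge 0$, the quantity
$$\mathrm{RLT}(j, X, L) = \frac{\mu\bigl(\{\varepsilon \in [0, \varepsilon_{\max}] : \beta_1(\varepsilon) = j\}\bigr)}{\varepsilon_{\max}}$$
is the proportion of the scale range $[0, \varepsilon_{\max}]$ over which the complex sustains exactly $j$ loops, where $\mu$ denotes Lebesgue measure.
- Witness Complex and Mean RLT (MRLT): For scalable computation, a small random subset $L \subset X$ of landmarks is selected uniformly from $X$ to build a witness complex. Repeating the draw $N_{\text{trials}}$ times yields
$$\mathrm{MRLT}(j, X) = \mathbb{E}_L\bigl[\mathrm{RLT}(j, X, L)\bigr] \approx \frac{1}{N_{\text{trials}}} \sum_{t=1}^{N_{\text{trials}}} \mathrm{RLT}(j, X, L_t),$$
and $\mathrm{MRLT}(\cdot, X)$ is a probability distribution over loop counts.
- Geometry Score Computation: Given two datasets (real $X_{\text{real}}$, generated $X_{\text{gen}}$), compare their MRLTs using the squared $\ell_2$ distance:
$$\mathrm{GeomScore}(X_{\text{real}}, X_{\text{gen}}) = \sum_{j=0}^{j_{\max}} \bigl(\mathrm{MRLT}(j, X_{\text{real}}) - \mathrm{MRLT}(j, X_{\text{gen}})\bigr)^2.$$
Optionally, the Earth-Mover’s Distance (EMD) between the two MRLT distributions can be used instead. Lower scores indicate better topological alignment.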
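The three steps above can be sketched in a few lines once persistence intervals are available. This is a minimal illustration, not the authors' implementation: the $H_1$ intervals are hard-coded hypothetical values standing in for the output of a persistent homology library, the function names are made up for this example, and features are treated as alive on the half-open interval from birth to death.

```python
def relative_living_times(intervals, eps_max, j_max):
    """RLT vector: fraction of [0, eps_max] on which beta_1 equals j, j = 0..j_max."""
    # beta_1 is piecewise constant and can only change at a birth or a death,
    # so exact measures follow from cutting [0, eps_max] at those points.
    cut_set = {0.0, eps_max}
    for birth, death in intervals:
        for x in (birth, death):
            if 0.0 < x < eps_max:
                cut_set.add(x)
    cuts = sorted(cut_set)
    rlt = [0.0] * (j_max + 1)
    for lo, hi in zip(cuts, cuts[1:]):
        mid = 0.5 * (lo + hi)
        count = sum(1 for b, d in intervals if b <= mid < d)
        if count <= j_max:  # loop counts above j_max are dropped
            rlt[count] += (hi - lo) / eps_max
    return rlt

def mean_rlt(rlts):
    """MRLT: elementwise average of per-trial RLT vectors."""
    return [sum(column) / len(rlts) for column in zip(*rlts)]

def geom_score(p, q):
    """Squared l2 distance between two MRLT vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Hypothetical H1 intervals from two landmark draws on the same dataset.
trial_a = relative_living_times([(0.25, 0.75)], eps_max=1.0, j_max=2)
trial_b = relative_living_times([], eps_max=1.0, j_max=2)
mrlt = mean_rlt([trial_a, trial_b])
print(trial_a)                            # [0.5, 0.5, 0.0]
print(mrlt)                               # [0.75, 0.25, 0.0]
print(geom_score(mrlt, [1.0, 0.0, 0.0]))  # 0.125
```

Each RLT vector sums to 1 as long as the loop count never exceeds `j_max`, which is what makes the averaged MRLT a probability distribution.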
4. Algorithmic Procedure and Computational Complexity
Key parameters and workflow:
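- Datasets $X_{\text{real}}$, $X_{\text{gen}}$
- Number of landmarks $L_0$ (typically 50–100)
- Number of trials $N_{\text{trials}}$ (enough repetitions for the MRLT to stabilize)
- Maximal loop count $j_{\max}$
- Scale factor $\gamma$ for $\varepsilon_{\max}$ (set as $\varepsilon_{\max} = \gamma \cdot \max_{u, v \in L} d(u, v)$)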
The algorithm proceeds:
```
for t = 1…N_trials:
    L = random subset of X, size L0
    compute pairwise distances d(L, X)
    ε_max = γ · max_{u, v ∈ L} d(u, v)
    build witness_complex W for ε ∈ [0, ε_max], dim ≤ 2
    compute persistence intervals I1 = {[b_i, d_i]} in H1
    form Betti curve β1(ε) from I1
    for j = 0…j_max:
        RLT_t[j] = (Lebesgue measure of {ε: β1(ε) = j}) / ε_max
p_X = average of RLT_t over t
return GeomScore(p_X_real, p_X_gen)
```
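Per trial, computing the landmark-to-data distance matrix costs $O(L_0 \cdot N \cdot D)$, where $N = |X|$ is the number of data points and $D$ is the ambient dimension. Building the witness complex and computing persistence for dimension ≤ 2 scales subcubically in $L_0$ and is independent of $D$. Overall complexity is therefore linear in $D$ and in the product $L_0 N$, times the number of trials $N_{\text{trials}}$.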
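The per-trial setup that dominates this cost (landmark sampling, the landmark-to-data distance matrix, and the maximal scale) might look as follows. This is a sketch under stated assumptions: Euclidean distance, a toy dataset of points on the unit circle, and illustrative parameter values that are not recommendations from the source. The helper names are hypothetical.

```python
import math
import random

def sample_landmarks(X, L0, rng):
    """Pick L0 distinct points of X uniformly at random."""
    return rng.sample(X, L0)

def landmark_distances(landmarks, X):
    """L0 x N matrix of Euclidean distances; costs O(L0 * N * D)."""
    return [[math.dist(u, x) for x in X] for u in landmarks]

def eps_max(landmarks, gamma):
    """Maximal scale: gamma times the largest landmark-to-landmark distance."""
    return gamma * max(math.dist(u, v) for u in landmarks for v in landmarks)

# Illustrative values only (assumptions, not taken from the source).
N, L0, gamma = 200, 8, 0.5
rng = random.Random(0)
X = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
     for k in range(N)]

landmarks = sample_landmarks(X, L0, rng)
D = landmark_distances(landmarks, X)
e_max = eps_max(landmarks, gamma)
print(len(D), len(D[0]))  # 8 200
```

Only the `L0 x N` matrix touches the ambient dimension; everything downstream works on distances alone, which is why the persistence step does not depend on $D$.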
5. Diagnostic Utility for GAN Evaluation
GAS is designed to detect GAN mode collapse and topological anomalies:
- Mode Collapse: Generated data lacking loops yields an MRLT concentrated at $j = 0$, resulting in a high geometry score versus real data.
- Partial Collapse or Missing Modes: Shifts in MRLT mass across different bins indicate nuanced topological mismatches.
- Empirical Examples: On synthetic circles with variable loop counts, GAS recovers correct values. In the CelebA “bad-DCGAN,” forced mode collapse yields an MRLT peaked at $j = 0$. On MNIST, WGAN-GP achieves MRLT distributions closer to real samples than vanilla WGAN.
This suggests that GAS can flag model-specific failures not captured by perceptual metrics.
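A small worked comparison illustrates how full and partial collapse separate numerically. The MRLT vectors below are hypothetical numbers chosen for readability, not measurements. The EMD helper relies on the standard fact that, for histograms on equally spaced bins, the one-dimensional Earth-Mover's Distance equals the sum of absolute differences of the cumulative distributions.

```python
from itertools import accumulate

def l2_score(p, q):
    """Squared l2 distance between two MRLT vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def emd_1d(p, q):
    """EMD between histograms on bins 0, 1, 2, ... with unit spacing."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

# Hypothetical MRLT vectors over loop counts j = 0, 1, 2.
p_real = [0.25, 0.5, 0.25]
p_collapsed = [1.0, 0.0, 0.0]  # all mass at j = 0: no loops ever observed
p_partial = [0.5, 0.5, 0.0]    # the j = 2 mass has shifted to j = 0

print(l2_score(p_real, p_collapsed), emd_1d(p_real, p_collapsed))  # 0.875 1.0
print(l2_score(p_real, p_partial), emd_1d(p_real, p_partial))      # 0.125 0.5
```

Both distances rank the fully collapsed model as worse. The EMD additionally reflects how far mass moved across bins, which can help when reading partial shifts.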
6. Implementation and Practical Considerations
Library choices include:
- GUDHI (Python): Fully supports witness complexes and persistent homology.
- Ripser, Dionysus: Efficient for Vietoris–Rips complexes and fast persistence computation, but lack dedicated witness-complex support.
Implementation heuristics:
- Random, uniform landmark selection.
- Sufficient trial repetitions $N_{\text{trials}}$ for MRLT stability.
- $\varepsilon_{\max}$ scaling via $\gamma$: $\varepsilon_{\max} = \gamma \cdot \max_{u, v \in L} d(u, v)$.
- Finely bin Betti curves or compute exact Lebesgue measures for numerical robustness.
- Computational bottleneck is pairwise distance calculation for high data dimensionality; persistent homology computation is negligible for simplex dimension $\le 2$.
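The binning heuristic can be illustrated with a midpoint-sampling approximation of the relative living times. This is a sketch with a hypothetical function name and a single made-up $H_1$ interval whose exact RLT is `[0.5, 0.5]`. It shows why coarse bins distort the estimate.

```python
def rlt_binned(intervals, eps_max, j_max, n_bins):
    """Approximate RLT by evaluating beta_1 at the midpoint of n_bins equal bins."""
    rlt = [0.0] * (j_max + 1)
    width = eps_max / n_bins
    for k in range(n_bins):
        eps = (k + 0.5) * width
        count = sum(1 for b, d in intervals if b <= eps < d)
        if count <= j_max:
            rlt[count] += 1.0 / n_bins
    return rlt

# Hypothetical single loop alive on [0.25, 0.75); exact RLT is [0.5, 0.5].
intervals = [(0.25, 0.75)]
coarse = rlt_binned(intervals, eps_max=1.0, j_max=1, n_bins=3)
fine = rlt_binned(intervals, eps_max=1.0, j_max=1, n_bins=1000)
print([round(v, 3) for v in coarse])  # [0.667, 0.333]
print([round(v, 3) for v in fine])    # [0.5, 0.5]
```

With three bins the loop appears to live for a third of the range instead of half. Cutting the scale range exactly at births and deaths avoids the issue entirely, at the price of a sort.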
7. Relationship to Other Metrics and Combined Usage
A comparison of principal GAN quality metrics is summarized as follows:
| Metric | Measures | Notes and limitations |
|---|---|---|
| Inception Score | Sharpness, diversity | Requires pretrained net; topology-insensitive |
| FID | Feature Gaussian fit | Fast; overlooks topological error |
| Geometry Score/GAS | Topological alignment | No visual fidelity; only $\beta_1$; higher compute cost |
IS and FID characterize perceptual quality and diversity but fail to identify topological pathologies. GAS operates without pretrained nets, is sensitive to mode collapse and topological mismatches, and extends to non-image domains, but does not directly address visual fidelity, is limited to first-order topology, and is computationally intensive.
In practice, joint application of GAS and perceptual metrics offers a more thorough diagnostic, with GAS highlighting topological defects and IS/FID reporting perceptual congruity (Khrulkov et al., 2018).