Center-Scale Consistency (CSC)
- Center-Scale Consistency (CSC) is a metric that assesses the alignment of data centers and scales by capturing essential topological features.
- CSC employs computational topology methods, including simplicial complexes and persistent homology, to identify issues like mode collapse.
- By using robust landmark selection and repeated trials, CSC offers a network-agnostic evaluation of high-dimensional data geometry.
The Geometric Alignment Score (GAS), also referred to as the Geometry Score, is a quantitative metric for comparing the quality and diversity of data generated by generative adversarial networks (GANs) against true data distributions. GAS evaluates the topological and geometric properties of data manifolds by leveraging concepts from computational algebraic topology, such as persistent homology, Betti numbers, and simplicial complexes. Unlike classifier-based or Gaussian-approximation metrics, GAS is network-agnostic and sensitive to topological pathologies such as mode collapse, making it suitable for diverse types of data, including non-visual domains (Khrulkov et al., 2018).
1. Theoretical Foundations and Motivation
GAS is grounded in the manifold hypothesis, which posits that high-dimensional data (such as images) reside on a low-dimensional, nonlinear manifold $\mathcal{M}_{\text{data}}$. The generative model's output is viewed as a sample from a model-induced manifold $\mathcal{M}_{\text{model}}$. Traditional GAN evaluation metrics, including Inception Score and Fréchet Inception Distance, rely on pretrained classifiers or Gaussian distribution assumptions and are often insensitive to distortions or collapses in the structural "shape" of the data. In contrast, GAS measures the degree of topological equivalence between the sampled "shapes" of the real and generated data, focusing on features invariant under continuous deformations (homeomorphisms), such as the number of connected components and independent loops.
2. Algebraic Topology Primitives in GAS
GAS employs core constructions from computational algebraic topology:
- Simplicial Complexes: Collections of subsets ("simplices") formed from a finite vertex set $V$, closed under the operation of taking subsets.
- Vietoris–Rips Complex and Filtration: For a metric space $(X, d)$ and $\varepsilon > 0$, the Vietoris–Rips complex $R_\varepsilon(X)$ comprises those simplices whose vertices are pairwise within distance $\varepsilon$. As $\varepsilon$ increases, one obtains a filtration: $R_{\varepsilon_1}(X) \subseteq R_{\varepsilon_2}(X)$ for $\varepsilon_1 \le \varepsilon_2$.
- Betti Numbers: For a simplicial complex $K$, the $k$th Betti number $\beta_k(K)$ quantifies the number of independent $k$-dimensional holes. $\beta_0$ is the number of connected components; $\beta_1$ counts 1-dimensional loops.
- Persistent Homology: Tracks the "birth" and "death" scales of topological features across the filtration, yielding a set of intervals $[b_j, d_j)$. The Betti curve $\beta_k(\varepsilon)$ records the number of $k$-dimensional features present at scale $\varepsilon$.
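To make these primitives concrete, the following minimal sketch computes $\beta_0$ and $\beta_1$ of a Vietoris–Rips complex at a single scale from the ranks of its boundary matrices. It is an illustration only, not the method of the cited paper: the function name `rips_betti_0_1` is hypothetical, coefficients are rational (NumPy rank over the reals), and the brute-force enumeration of triangles is only practical for a few dozen points.

```python
from itertools import combinations

import numpy as np


def rips_betti_0_1(points, eps):
    """Return (beta_0, beta_1) of the Vietoris-Rips complex at scale eps.

    Only vertices, edges and triangles are built, which is enough to
    determine beta_0 and beta_1.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

    # An edge joins two points within distance eps; a triangle needs all three edges.
    edges = [e for e in combinations(range(n), 2) if dist[e] <= eps]
    index = {e: col for col, e in enumerate(edges)}
    triangles = [t for t in combinations(range(n), 3)
                 if all(pair in index for pair in combinations(t, 2))]

    rank_d1 = 0
    if edges:
        d1 = np.zeros((n, len(edges)))          # boundary map: edges -> vertices
        for col, (i, j) in enumerate(edges):
            d1[i, col], d1[j, col] = -1.0, 1.0
        rank_d1 = np.linalg.matrix_rank(d1)

    rank_d2 = 0
    if triangles:
        d2 = np.zeros((len(edges), len(triangles)))  # boundary map: triangles -> edges
        for col, (i, j, k) in enumerate(triangles):
            d2[index[(j, k)], col] = 1.0
            d2[index[(i, k)], col] = -1.0
            d2[index[(i, j)], col] = 1.0
        rank_d2 = np.linalg.matrix_rank(d2)

    beta_0 = n - rank_d1                         # connected components
    beta_1 = (len(edges) - rank_d1) - rank_d2    # independent loops
    return int(beta_0), int(beta_1)


# Eight evenly spaced points on the unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
for eps in (0.5, 1.0, 2.5):
    print(eps, rips_betti_0_1(circle, eps))
```

At a small scale the points are isolated, at an intermediate scale they join into a single loop, and at a large scale the loop is filled in. Recording $\beta_1$ as the scale sweeps through the filtration gives the Betti curve.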
3. Geometry Score Construction
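The Geometry Score specifically targets the distribution of 1-dimensional holes ($\beta_1$) in the data. Its derivation proceeds as follows:
- Relative Living Time (RLT): For a filtration over $[0, \varepsilon_{\max}]$, the RLT for exactly $i$ loops is defined as
$$\mathrm{RLT}(i) = \frac{\mu\left(\{\varepsilon \in [0, \varepsilon_{\max}] : \beta_1(\varepsilon) = i\}\right)}{\varepsilon_{\max}},$$
where $\mu$ is the Lebesgue measure. This defines a probability mass function over loop counts: the proportion of the parameter range ($\varepsilon$ values) for which the data has exactly $i$ loops.
- Landmark Selection and Mean RLT: To handle large datasets efficiently, a small random subset of $L_0$ points ("landmarks") is selected, and the RLT is computed for each choice. Repeating the process $n$ times yields the Mean RLT (MRLT), $\mathrm{MRLT}(i) = \frac{1}{n} \sum_{j=1}^{n} \mathrm{RLT}_j(i)$, where $\mathrm{RLT}_j$ is the RLT of the $j$th trial.
- Geometry Score (GAS): The degree of geometric alignment between two datasets $X_{\text{real}}$ (real) and $X_{\text{gen}}$ (generated) is given by the squared $L_2$ distance between their MRLT distributions:
$$\mathrm{GAS}(X_{\text{real}}, X_{\text{gen}}) = \sum_{i=0}^{i_{\max}} \left(\mathrm{MRLT}_{X_{\text{real}}}(i) - \mathrm{MRLT}_{X_{\text{gen}}}(i)\right)^2,$$
where $i_{\max}$ is the largest loop count considered. Alternatively, the Earth-Mover's Distance (EMD) between these distributions can be employed. Lower scores indicate greater topological similarity (Khrulkov et al., 2018).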
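The construction above can be sketched in a few lines once Betti curves are available. The sketch below assumes each trial's $\beta_1$ curve has already been sampled on a uniform grid over the filtration range, so the fraction of grid points with a given loop count approximates the RLT. All function names are hypothetical, and clipping loop counts above `i_max` into the last bin is a choice made for this illustration.

```python
import numpy as np


def rlt_from_betti_curve(betti_1, i_max):
    """Approximate RLT from a beta_1 curve sampled on a uniform scale grid.

    Entry i is the fraction of grid points at which exactly i loops are
    present; counts above i_max are clipped into the last bin.
    """
    counts = np.clip(np.asarray(betti_1, dtype=int), 0, i_max)
    return np.bincount(counts, minlength=i_max + 1) / len(counts)


def mean_rlt(betti_curves, i_max):
    """Average the per-trial RLTs (one Betti curve per landmark sample)."""
    return np.mean([rlt_from_betti_curve(c, i_max) for c in betti_curves], axis=0)


def geometry_score(mrlt_a, mrlt_b):
    """Squared L2 distance between two MRLT vectors."""
    diff = np.asarray(mrlt_a, dtype=float) - np.asarray(mrlt_b, dtype=float)
    return float(np.sum(diff ** 2))


def emd_1d(mrlt_a, mrlt_b):
    """Earth-Mover's Distance between two distributions on 0..i_max.

    With unit spacing between bins this equals the summed absolute
    difference of the cumulative distributions.
    """
    return float(np.sum(np.abs(np.cumsum(mrlt_a) - np.cumsum(mrlt_b))))


# Toy Betti curves (made-up values): two trials per dataset, eight scales each.
real = mean_rlt([[0, 0, 1, 1, 1, 1, 0, 0], [0, 1, 1, 1, 1, 1, 1, 0]], i_max=2)
generated = mean_rlt([[0] * 8, [0] * 8], i_max=2)
print(real, generated)
print(geometry_score(real, generated), emd_1d(real, generated))
```

In this toy case the generated data never shows a loop, so its MRLT sits entirely on the zero-loop bin and both distances are clearly nonzero.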
4. Algorithmic Workflow and Complexity
The high-level algorithm for computing GAS is as follows:
- Sample $L_0$ landmarks from the dataset $X$.
- Construct the witness complex on these landmarks, calculating pairwise distances and setting the filtration range $[0, \varepsilon_{\max}]$, typically with $\varepsilon_{\max}$ proportional to the largest pairwise distance between landmarks.
- Compute persistent homology up to dimension 1 (loops), extract persistence intervals, and form the Betti curve $\beta_1(\varepsilon)$.
- For each possible loop count $i$, compute the RLT.
- Repeat the above for $n$ random landmark samples, averaging to obtain the MRLT.
- Calculate the score (squared $L_2$ distance or EMD) between the MRLTs of real and generated data.
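The steps above can be sketched end to end with NumPy alone. This sketch makes several simplifying substitutions that are assumptions of the example, not the procedure described in the source: it builds a Vietoris–Rips complex on the landmarks instead of a witness complex, and it evaluates $\beta_1$ on a grid of scales through boundary-matrix ranks instead of running persistent homology. The function names and the defaults (`l0`, `gamma`, `i_max`, `n_trials`, `n_scales`) are hypothetical and chosen small so the example runs quickly.

```python
from itertools import combinations

import numpy as np


def betti_1_rips(dist, eps):
    """beta_1 (rational coefficients) of the Rips complex at scale eps."""
    n = len(dist)
    edges = [e for e in combinations(range(n), 2) if dist[e] <= eps]
    if not edges:
        return 0
    index = {e: col for col, e in enumerate(edges)}
    tris = [t for t in combinations(range(n), 3)
            if all(pair in index for pair in combinations(t, 2))]
    d1 = np.zeros((n, len(edges)))               # edges -> vertices
    for col, (i, j) in enumerate(edges):
        d1[i, col], d1[j, col] = -1.0, 1.0
    rank_d2 = 0
    if tris:
        d2 = np.zeros((len(edges), len(tris)))   # triangles -> edges
        for col, (i, j, k) in enumerate(tris):
            d2[index[(j, k)], col] = 1.0
            d2[index[(i, k)], col] = -1.0
            d2[index[(i, j)], col] = 1.0
        rank_d2 = np.linalg.matrix_rank(d2)
    return int(len(edges) - np.linalg.matrix_rank(d1) - rank_d2)


def mrlt(data, l0=12, gamma=1.0, i_max=3, n_trials=5, n_scales=64, seed=0):
    """Mean RLT over n_trials random landmark samples of size l0."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    total = np.zeros(i_max + 1)
    for _ in range(n_trials):
        landmarks = data[rng.choice(len(data), size=l0, replace=False)]
        dist = np.linalg.norm(landmarks[:, None] - landmarks[None, :], axis=-1)
        # Filtration range: a fraction gamma of the largest landmark distance.
        eps_grid = np.linspace(0.0, gamma * dist.max(), n_scales)
        loops = [min(betti_1_rips(dist, eps), i_max) for eps in eps_grid]
        total += np.bincount(loops, minlength=i_max + 1) / n_scales
    return total / n_trials


def geometry_score(real, generated, **kwargs):
    """Squared L2 distance between the MRLTs of two datasets."""
    return float(np.sum((mrlt(real, **kwargs) - mrlt(generated, **kwargs)) ** 2))


# A circle (one loop) against a line segment (no loops), 12 points each.
theta = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
line = np.column_stack([np.linspace(0.0, 1.0, 12), np.zeros(12)])
print(mrlt(circle, n_trials=2), mrlt(line, n_trials=2))
print(geometry_score(circle, line, n_trials=2))
```

A production implementation would replace `betti_1_rips` with a witness-complex persistent homology routine from a dedicated library, which scales to far more landmarks than this brute-force rank computation.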
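The computational bottleneck is the $O(L_0 N D)$ cost of computing all landmark–data distances for $L_0$ landmarks and $N$ data points of dimension $D$. Persistent homology in low dimension over modestly sized complexes (determined by $L_0$, up to about $100$) is generally subcubic in the size of the complex and independent of the data dimension $D$.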
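| Step | Computational Cost | Notes |
|---|---|---|
| Distances | $O(L_0 N D)$ | Dominant for high data dimension $D$ |
| Witness complex + PH | Subcubic in the size of the complex | Fast for dim $\le 1$ |
| Trials | Multiply all above by the number of trials $n$ | Repeated trials recommended |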
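To illustrate where the distance cost comes from, the sketch below computes all landmark-to-data Euclidean distances in one vectorized step. The function name `landmark_distances` is hypothetical, and the Euclidean metric is an assumption; any metric suited to the data could be substituted. The matrix product between an $(L_0 \times D)$ and a $(D \times N)$ array accounts for the $L_0 N D$ multiply-adds.

```python
import numpy as np


def landmark_distances(data, landmarks):
    """Euclidean distances from every landmark to every data point.

    Returns an array of shape (L0, N). Uses the expansion
    |a - b|^2 = |a|^2 + |b|^2 - 2 a.b, whose dominant cost is the
    (L0 x D) @ (D x N) matrix product.
    """
    data = np.asarray(data, dtype=float)
    landmarks = np.asarray(landmarks, dtype=float)
    sq = (np.sum(landmarks ** 2, axis=1)[:, None]
          + np.sum(data ** 2, axis=1)[None, :]
          - 2.0 * landmarks @ data.T)
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.sqrt(np.maximum(sq, 0.0))


points = np.array([[0.0, 0.0], [3.0, 4.0]])
print(landmark_distances(points, points[:1]))
```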
5. Diagnostic Power and Applicability
GAS detects various GAN failure modes:
- Mode Collapse: Generated data lacking loops ($\beta_1(\varepsilon) = 0$ for all $\varepsilon$) concentrates the MRLT at $i = 0$, yielding high divergence from the real data's MRLT.
- Partial Mode Collapse or Missing Modes: Manifest as distinctive shifts in MRLT mass across bins.
- Empirical Examples: The metric recovers the correct loop counts in synthetic datasets (e.g., circles with known numbers of loops), reveals mode collapse in CelebA with “bad-DCGAN,” and differentiates GAN variants (e.g., WGAN-GP vs. vanilla WGAN on MNIST) (Khrulkov et al., 2018).
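As a worked illustration with hypothetical numbers (not taken from the source), suppose the real data's MRLT places mass $0.4$ on zero loops and $0.6$ on one loop, while a fully collapsed generator shows no loops at any scale, so its MRLT is $(1, 0)$. The squared-distance score is then

$$\mathrm{GAS} = (0.4 - 1)^2 + (0.6 - 0)^2 = 0.36 + 0.36 = 0.72,$$

whereas a generator whose MRLT matched the real one exactly would score $0$.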
A plausible implication is that GAS enables unsupervised detection of generative model failures that are invisible to traditional classifier-based metrics. Its applicability extends to non-image data, provided a meaningful metric space is defined.
6. Practical Implementation Considerations
Implementation guidelines specify:
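- Software: GUDHI (Python) for witness complexes and persistent homology; Ripser and Dionysus handle Vietoris–Rips complexes but do not provide witness complexes.
- Landmarks: Uniform random sampling; $L_0$ up to about $100$ suffices.
- Trials: Repeated trials are needed for MRLT stability.
- Epsilon scaling: $\varepsilon_{\max} = \gamma \cdot \max_{l, l' \in L} d(l, l')$, where $L$ is the landmark set and $\gamma$ is a fixed proportionality coefficient.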
- Numerical accuracy: Either fine binning of Betti curves or exact Lebesgue measure from persistence intervals.
- Compute time: Dominated by distance calculations for high-dimensional data, with persistent homology computation remaining efficient due to low simplex dimensions (Khrulkov et al., 2018).
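The exact-measure option mentioned above can be sketched as follows: given dimension-1 persistence intervals, the filtration range is split at every birth and death, and the length of each piece is credited to the loop count alive on it. The function name `rlt_exact` is hypothetical, and clipping counts above `i_max` into the last bin is an assumption of this example.

```python
import numpy as np


def rlt_exact(intervals, eps_max, i_max):
    """Exact RLT from persistence intervals [(birth, death), ...].

    Deaths beyond eps_max (including infinity) are clipped to eps_max.
    """
    clipped = [(min(b, eps_max), min(d, eps_max)) for b, d in intervals]
    # beta_1 is constant between consecutive births/deaths.
    cuts = sorted({0.0, float(eps_max), *(x for iv in clipped for x in iv)})
    rlt = np.zeros(i_max + 1)
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        mid = 0.5 * (lo + hi)
        alive = sum(b <= mid < d for b, d in clipped)
        rlt[min(alive, i_max)] += hi - lo
    return rlt / eps_max


# Two loops: one alive on [0.2, 0.6), one born at 0.4 that never dies.
print(rlt_exact([(0.2, 0.6), (0.4, float("inf"))], eps_max=1.0, i_max=2))
```

Compared with binning the Betti curve on a grid, this avoids discretization error at the cost of sorting the interval endpoints.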
7. Comparison with Other GAN Metrics
GAS differs categorically from Inception Score (IS) and Fréchet Inception Distance (FID):
| Metric | Basis | Pros | Cons |
|---|---|---|---|
| IS | Classifier | Fast; measures sharpness & diversity | Requires pretrained net; ignores topology |
| FID | Gaussian fit | Cheap; easy to use | Can miss topological mismatches |
| GAS | Topology | Network-free; reveals topological failure; general | Ignores pixel-level fidelity; more expensive (PH) |
GAS is orthogonal to IS and FID: IS and FID capture perceptual fidelity and broad diversity, whereas GAS quantifies topological structure. The metrics can therefore be combined to yield a multifaceted evaluation. GAS directly evaluates the structure of the data manifold, which is crucial for applications sensitive to data topology, but it does not measure sample quality in terms of realism or human perception (Khrulkov et al., 2018).