
Center-Scale Consistency (CSC)

Updated 5 December 2025
  • Center-Scale Consistency (CSC) is a metric that assesses the alignment of data centers and scales by capturing essential topological features.
  • CSC employs computational topology methods, including simplicial complexes and persistent homology, to identify issues like mode collapse.
  • By using robust landmark selection and repeated trials, CSC offers a network-agnostic evaluation of high-dimensional data geometry.

The Geometric Alignment Score (GAS), also referred to as the Geometry Score, is a quantitative metric for comparing the quality and diversity of data generated by generative adversarial networks (GANs) against true data distributions. GAS evaluates the topological and geometric properties of data manifolds by leveraging concepts from computational algebraic topology, such as persistent homology, Betti numbers, and simplicial complexes. Unlike classifier-based or Gaussian-approximation metrics, GAS is network-agnostic and sensitive to topological pathologies such as mode collapse, making it suitable for diverse types of data, including non-visual domains (Khrulkov et al., 2018).

1. Theoretical Foundations and Motivation

GAS is grounded in the manifold hypothesis, which posits that high-dimensional data (such as images) reside on a low-dimensional, nonlinear manifold $\mathcal{M}_{\mathrm{data}} \subset \mathbb{R}^D$. The generative model's output is viewed as a sample from a model-induced manifold $\mathcal{M}_{\mathrm{model}}$. Traditional GAN evaluation metrics, including Inception Score and Fréchet Inception Distance, rely on pretrained classifiers or Gaussian distribution assumptions and are often insensitive to distortions or collapses in the structural "shape" of the data. In contrast, GAS measures the degree of topological equivalence between the sampled "shapes" of the real and generated data, focusing on features invariant under smooth deformations (homeomorphisms), such as the number of connected components and independent loops.

2. Algebraic Topology Primitives in GAS

GAS employs core constructions from computational algebraic topology:

  • Simplicial Complexes: Collections of subsets ("simplices") formed from a finite vertex set $Z = \{z_1, \ldots, z_n\}$, closed under the operation of taking subsets.
  • Vietoris–Rips Complex and Filtration: For a metric space $(X, d)$ and $\varepsilon \geq 0$, the Vietoris–Rips complex $R_\varepsilon(X)$ comprises those simplices whose vertices are pairwise within distance $\varepsilon$. As $\varepsilon$ increases, one obtains a filtration: $R_{\varepsilon_0}(X) \subset R_{\varepsilon_1}(X) \subset \cdots$.
  • Betti Numbers: For a simplicial complex $\mathcal{S}$, the $k$th Betti number $\beta_k(\mathcal{S}) = \dim H_k(\mathcal{S})$ quantifies the number of independent $k$-dimensional holes. $\beta_0$ is the number of connected components; $\beta_1$ counts 1-dimensional loops.
  • Persistent Homology: Tracks the “birth” and “death” scale intervals $[b_i, d_i]$ of topological features across the filtration, yielding a set of intervals $\mathcal{I}_k$. The Betti curve $\beta_k(\varepsilon) = |\{ i : b_i \leq \varepsilon \leq d_i \}|$ records the number of $k$-dimensional features present at scale $\varepsilon$.
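
The two simplest quantities in this list can be illustrated in a few lines of standard-library Python. This is a minimal sketch rather than a reference implementation: the function names `rips_beta0` and `betti_curve` and the toy inputs are assumptions made for the illustration. The first function uses the fact that $\beta_0$ of $R_\varepsilon(X)$ equals the number of connected components of the graph joining points at distance at most $\varepsilon$; the second evaluates the Betti curve definition directly on a given list of intervals.

```python
import math


def rips_beta0(points, eps):
    """beta_0 of the Vietoris-Rips complex R_eps(X).

    Counts connected components of the graph that joins every pair of
    points at distance <= eps, using a small union-find structure.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


def betti_curve(intervals, eps):
    """Number of persistence intervals [b, d] with b <= eps <= d."""
    return sum(1 for birth, death in intervals if birth <= eps <= death)


# Toy inputs (hypothetical, for illustration only).
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
components = [rips_beta0(points, eps) for eps in (0.5, 1.0, 4.0)]

intervals_dim1 = [(0.1, 0.5), (0.3, 0.9), (0.6, 0.7)]
curve = [betti_curve(intervals_dim1, eps) for eps in (0.0, 0.2, 0.4, 0.65, 0.95)]
```

As the scale grows, the component count in `components` can only decrease, while the loop count in `curve` rises and falls as intervals are born and die.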

3. Geometry Score Construction

The Geometry Score specifically targets the distributions of 1-dimensional holes ($\beta_1$) in the data. Its derivation proceeds as follows:

  • Relative Living Time (RLT): For a filtration up to $\varepsilon_{\max}$, the RLT for exactly $j$ loops is defined as

$$\mathrm{RLT}(j; X) = \frac{\mathrm{Lebesgue}\{\varepsilon \in [0, \varepsilon_{\max}] : \beta_1(\varepsilon) = j\}}{\varepsilon_{\max}}$$

This defines a probability mass function over loop counts: the proportion of the scale range ($\varepsilon$ values) for which the data has exactly $j$ loops.

  • Landmark Selection and Mean RLT: To handle large datasets efficiently, a small random subset of points $L \subset X$ ("landmarks") is selected, and RLT is computed for each choice. Repeating the process $N_{\mathrm{trials}}$ times and averaging yields the Mean RLT (MRLT), $p_X(j) = \mathbb{E}_L[\mathrm{RLT}(j; X, L)]$.
  • Geometry Score (GAS): The degree of geometric alignment between two datasets $X_1$ (real) and $X_2$ (generated) is given by the squared $L_2$ distance between their MRLT distributions:

$$\mathrm{GeomScore}(X_1, X_2) = \sum_{j=0}^{j_{\max}} \left[ p_{X_1}(j) - p_{X_2}(j) \right]^2$$

Alternatively, the Earth-Mover’s Distance (EMD) between these distributions can be employed. Lower scores indicate greater topological similarity (Khrulkov et al., 2018).
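
The construction above can be sketched in plain Python, assuming the dimension-1 persistence intervals for each landmark trial have already been computed by some homology library. The function names, the toy intervals, and the choice to fold loop counts above $j_{\max}$ into the last bin are assumptions of this sketch and are not taken from the cited paper. RLT is computed exactly from interval endpoints, since $\beta_1(\varepsilon)$ is piecewise constant between them.

```python
def relative_living_times(intervals, eps_max, j_max):
    """RLT(j) for j = 0..j_max from dimension-1 intervals (b, d)."""
    # beta_1 is constant between consecutive endpoints, so collect the
    # endpoints clipped to [0, eps_max] and measure each piece exactly.
    cuts = {0.0, float(eps_max)}
    for b, d in intervals:
        cuts.add(min(max(b, 0.0), eps_max))
        cuts.add(min(max(d, 0.0), eps_max))
    cuts = sorted(cuts)

    rlt = [0.0] * (j_max + 1)
    for lo, hi in zip(cuts, cuts[1:]):
        mid = 0.5 * (lo + hi)
        loops = sum(1 for b, d in intervals if b <= mid <= d)
        # Assumption of this sketch: counts above j_max go in the last bin.
        rlt[min(loops, j_max)] += (hi - lo) / eps_max
    return rlt


def mean_rlt(rlts):
    """MRLT: average the per-trial RLT vectors bin by bin."""
    return [sum(column) / len(rlts) for column in zip(*rlts)]


def geometry_score(p, q):
    """Squared L2 distance between two MRLT vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def emd_1d(p, q):
    """Earth-Mover's Distance between pmfs on bins 0..j_max (unit spacing)."""
    total, carry = 0.0, 0.0
    for a, b in zip(p, q):
        carry += a - b  # mass that must still be moved to later bins
        total += abs(carry)
    return total


# Hypothetical intervals for two landmark trials per dataset.
real_trials = [[(0.2, 0.6), (0.4, 0.8)], [(0.2, 0.6)]]
generated_trials = [[], []]  # no loops at any scale

p_real = mean_rlt([relative_living_times(t, 1.0, 2) for t in real_trials])
p_gen = mean_rlt([relative_living_times(t, 1.0, 2) for t in generated_trials])
score = geometry_score(p_real, p_gen)
emd = emd_1d(p_real, p_gen)
```

In this toy case the generated data places all of its MRLT mass at $j = 0$, so both distances are strictly positive.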

4. Algorithmic Workflow and Complexity

The high-level algorithm for computing GAS is as follows:

  1. Sample $L_0$ landmarks from the dataset $X$.
  2. Construct the witness complex on these landmarks, calculating pairwise distances and setting the filtration range $\varepsilon_{\max} = \gamma \cdot \max_{u,v \in L} d(u,v)$, typically with $\gamma \approx 1/100$.
  3. Compute persistent homology up to dimension 1 (loops), extract persistence intervals, and form the Betti curve $\beta_1(\varepsilon)$.
  4. For each possible loop count $j \in \{0, \ldots, j_{\max}\}$, compute RLT.
  5. Repeat the above for $N_{\mathrm{trials}}$ random landmark samples, averaging to obtain MRLT.
  6. Calculate the score (squared $L_2$ or EMD) between the MRLTs of real and generated data.

The computational bottleneck is the $O(N \cdot L_0 \cdot D)$ cost of computing all landmark–data distances, where $N$ here denotes the number of data points in $X$. Persistent homology in low dimension over modestly sized complexes (determined by $L_0 \approx 50$–$100$) is generally subcubic in $L_0$ and independent of the data dimension $D$.

| Step | Computational Cost | Notes |
|---|---|---|
| Distances | $O(N \cdot L_0 \cdot D)$ | Dominant for high $D$ |
| Witness complex + PH | Subcubic in $L_0$ | Fast for dim $\leq 2$ |
| Trials | Multiply all of the above by $N_{\mathrm{trials}}$ | $N_{\mathrm{trials}} = 10^3$–$10^4$ recommended |
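
Steps 1 and 2 of the workflow, together with the distance table that dominates the cost, might look as follows in standard-library Python. This is a sketch under stated assumptions: the helper names, the Euclidean metric, the toy data, and the fixed random seed are all hypothetical, and the witness complex and persistent homology computation (steps 3 onward) are left to a dedicated library.

```python
import math
import random


def sample_landmarks(points, n_landmarks, rng):
    """Step 1: choose landmarks uniformly at random, without replacement."""
    return rng.sample(points, n_landmarks)


def filtration_limit(landmarks, gamma=1 / 100):
    """Step 2: eps_max = gamma * (largest pairwise landmark distance)."""
    diameter = max(math.dist(u, v) for u in landmarks for v in landmarks)
    return gamma * diameter


def landmark_distances(points, landmarks):
    """Table of distances from every data point to every landmark.

    The table has N rows and L0 columns, and each entry costs O(D),
    which gives the O(N * L0 * D) term discussed above.
    """
    return [[math.dist(x, lm) for lm in landmarks] for x in points]


# Toy data: 200 points in the unit cube (hypothetical, for illustration).
rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]

landmarks = sample_landmarks(points, 50, rng)
eps_max = filtration_limit(landmarks)
table = landmark_distances(points, landmarks)
```

In a full pipeline this block would run once per trial, with a fresh landmark sample each time, which is why the per-trial cost is multiplied by the number of trials.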

5. Diagnostic Power and Applicability

GAS detects various GAN failure modes:

  • Mode Collapse: Generated data lacking loops ($\beta_1 = 0$ for all $\varepsilon$) concentrates MRLT at $j = 0$, yielding high divergence from real data's MRLT.
  • Partial Mode Collapse or Missing Modes: Manifest as distinctive shifts in MRLT mass across $j \neq 0$ bins.
  • Empirical Examples: The metric recovers the correct loop counts in synthetic datasets (e.g., circles with known numbers of loops), reveals mode collapse in CelebA with “bad-DCGAN,” and differentiates GAN variants (e.g., WGAN-GP vs. vanilla WGAN on MNIST) (Khrulkov et al., 2018).

A plausible implication is that GAS enables unsupervised detection of generative model failures that are invisible to traditional classifier-based metrics. Its applicability extends to non-image data, provided a meaningful metric space is defined.

6. Practical Implementation Considerations

Implementation guidelines specify:

  • Software: GUDHI (Python) for witness complexes and persistent homology; Ripser and Dionysus for Vietoris–Rips complexes but without witness functionality.
  • Landmarks: Uniform random sampling; $L_0 \approx 50$–$100$ suffices.
  • Trials: $N_{\mathrm{trials}} \gtrsim 10^3$–$10^4$ trials are needed for MRLT stability.
  • Epsilon scaling: $\varepsilon_{\max} = \gamma \cdot \max_{u,v \in L} d(u,v)$, with $\gamma \approx 1/100$.
  • Numerical accuracy: Either fine binning of Betti curves or exact Lebesgue measure from persistence intervals.
  • Compute time: Dominated by distance calculations for high-dimensional data, with persistent homology computation remaining efficient due to low simplex dimensions (Khrulkov et al., 2018).

7. Comparison with Other GAN Metrics

GAS differs categorically from Inception Score (IS) and Fréchet Inception Distance (FID):

| Metric | Basis | Pros | Cons |
|---|---|---|---|
| IS | Classifier | Fast; measures sharpness & diversity | Requires pretrained net; ignores topology |
| FID | Gaussian fit | Cheap; easy to use | Can miss topological mismatches |
| GAS | Topology | Network-free; reveals topological failure; general | Ignores pixel-level fidelity; more expensive (PH) |

GAS is orthogonal to IS and FID: while IS/FID capture perceptual fidelity and broad diversity, GAS quantifies topological structure and can be combined with these metrics to yield a multifaceted evaluation. GAS provides direct evaluation on the structure of the data manifold, which is crucial for applications sensitive to data topology, but does not measure sample quality in terms of realism or human perception (Khrulkov et al., 2018).
