Skew Inception Distance (SID) Explained
- Skew Inception Distance (SID) is a metric that extends FID by incorporating third-moment (skewness) information to capture non-Gaussian features in image synthesis.
- It computes discrepancies between real and generated distributions using means, covariances, and coskewness tensors, optimized via PCA for efficient calculation.
- SID aligns more closely with human perception by detecting perceptually meaningful distortions, making it a robust tool for assessing generative models.
Skew Inception Distance (SID) is a statistical metric designed to evaluate the quality of feature distributions produced by generative models, notably Generative Adversarial Networks (GANs). SID explicitly incorporates third-moment (skewness) information in feature space, thereby extending the well-established Fréchet Inception Distance (FID)—which considers only first and second moments. Originating in the context of image synthesis, SID is motivated by the observation that FID’s Gaussian assumption misses non-Gaussian structure present in real-world data. SID is rigorously defined, admits an efficient practical implementation via dimensionality reduction, and exhibits empirical properties distinct from FID, sometimes aligning more closely with human perceptual judgments (Luzi et al., 2023).
1. Mathematical Definition
Let $\{x_i\}_{i=1}^{n}$ and $\{y_j\}_{j=1}^{m}$, with $x_i, y_j \in \mathbb{R}^d$, be feature vectors extracted—typically from the penultimate layer of an Inception-v3 network—from real and generated images, respectively. SID compares the empirical distributions of these sets through their first three moments:
- Means: $\mu_r = \frac{1}{n}\sum_i x_i$, $\mu_g = \frac{1}{m}\sum_j y_j$
- Covariances: $\Sigma_r = \frac{1}{n}\sum_i (x_i - \mu_r)(x_i - \mu_r)^\top$, and similarly $\Sigma_g$ for the generated features
- Coskewness tensors: $S_r \in \mathbb{R}^{d \times d \times d}$, with entries $(S_r)_{abc} = \frac{1}{n}\sum_i \tilde{x}_{i,a}\,\tilde{x}_{i,b}\,\tilde{x}_{i,c}$, where $\tilde{x}_i = x_i - \mu_r$, and analogously $S_g$ for the generated features
The full Skew Inception Distance is then:

$$\mathrm{SID} = \|\mu_r - \mu_g\|_2^2 + \mathrm{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right) + \big\|\sqrt[3]{S_r} - \sqrt[3]{S_g}\big\|_F,$$

where the cube root $\sqrt[3]{\cdot}$ is applied elementwise to normalize units ("cube-root normalization") (Luzi et al., 2023). For the third term, the Frobenius norm is used.
SID is thus FID augmented by a non-Gaussian skewness component, allowing it to detect discrepancies in higher-order moments between real and generated distributions.
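The definition above can be sketched directly in NumPy. The following is a minimal illustration (helper names are mine, not from the paper); the covariance trace term is computed via eigenvalues, using the fact that a product of two PSD matrices has real, nonnegative eigenvalues:

```python
import numpy as np

def coskewness(X):
    """Empirical coskewness tensor with entries
    S[a, b, c] = mean over samples of z_a * z_b * z_c, where z = x - mean."""
    Z = X - X.mean(axis=0)
    return np.einsum('na,nb,nc->abc', Z, Z, Z) / len(Z)

def sid(X_real, X_gen):
    """SID = FID-style mean and covariance terms, plus the Frobenius norm
    of the difference of elementwise cube-rooted coskewness tensors."""
    mu_r, mu_g = X_real.mean(axis=0), X_gen.mean(axis=0)
    C_r = np.cov(X_real, rowvar=False)
    C_g = np.cov(X_gen, rowvar=False)
    # Tr((C_r C_g)^{1/2}) equals the sum of square roots of the
    # (real, nonnegative) eigenvalues of C_r C_g.
    eigs = np.linalg.eigvals(C_r @ C_g).real
    frechet = np.trace(C_r) + np.trace(C_g) - 2 * np.sqrt(np.clip(eigs, 0, None)).sum()
    # np.cbrt applies the cube root elementwise and handles negative entries.
    skew = np.linalg.norm(np.cbrt(coskewness(X_real)) - np.cbrt(coskewness(X_gen)))
    return np.sum((mu_r - mu_g) ** 2) + frechet + skew
```

Setting `X_gen = X_real` drives all three terms to (numerically) zero, matching the identity-of-indiscernibles direction of the metric discussion below.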
2. Metricity and Pseudometric Properties
SID defines a metric on the space of distributions determined by their first three moments. If the mapping $P \mapsto (\mu, \Sigma, S)$ is injective (i.e., the first three moments characterize the distribution), SID is a true metric; otherwise, it is a pseudometric—i.e., $\mathrm{SID}(P, Q) = 0$ is possible for distinct $P$ and $Q$ sharing the same first three moments (Luzi et al., 2023). This distinction preserves SID's validity as a quantitative tool but means that $\mathrm{SID} = 0$ does not guarantee full distributional equality unless higher moments also match or are irrelevant in the application domain.
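A concrete one-dimensional instance of this pseudometric behavior (an illustrative example, not drawn from the paper):

```latex
% Let P = \mathcal{N}(0, 1) and Q = \mathrm{Unif}(-\sqrt{3}, \sqrt{3}). Then
\mu_P = \mu_Q = 0, \qquad \sigma_P^2 = \sigma_Q^2 = 1, \qquad S_P = S_Q = 0,
% so SID(P, Q) = 0, even though the distributions differ in their fourth moments:
\mathbb{E}_P[X^4] = 3 \;\neq\; \tfrac{9}{5} = \mathbb{E}_Q[X^4].
```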
3. Practical Computation and PCA Acceleration
Direct computation of the coskewness term in high dimensions is prohibitive—$O(nd^3)$ time and $O(d^3)$ memory. For the typical Inception-v3 feature space with $d = 2048$, a single coskewness tensor requires up to 64 GB of RAM. To address this, dimensionality reduction via principal component analysis (PCA) is performed:
- Fit PCA on the real feature vectors $\{x_i\}$, obtaining the top $k$ principal axes.
- Project $\{x_i\}$ and $\{y_j\}$ onto these axes, reducing both to $k$ dimensions.
- Optionally, rescale the covariance to match the full-trace energy.
- Compute the mean, covariance, and coskewness in the $k$-dimensional space.
- Assemble SID using the projected moments (Luzi et al., 2023).
This allows SID, including the skewness term, to be computed in seconds on standard hardware for moderate $k$. PCA also leaves FID's trace term tractable. Robustness checks indicate that the skewness deviations persist after PCA, even at small values of $k$ (Luzi et al., 2023).
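The projection steps above can be sketched as follows. This is a simplified reading, not the paper's implementation; in particular, the trace-rescaling step is one plausible interpretation of "match the full-trace energy":

```python
import numpy as np

def pca_reduce(X_real, X_gen, k, rescale=True):
    """Fit PCA on the real features only, then project both feature
    sets onto the top-k principal axes."""
    mu = X_real.mean(axis=0)
    # SVD of the centered real features; rows of Vt are principal axes.
    _, _, Vt = np.linalg.svd(X_real - mu, full_matrices=False)
    W = Vt[:k].T                                  # d x k projection matrix
    Z_r, Z_g = (X_real - mu) @ W, (X_gen - mu) @ W
    if rescale:
        # Optional: scale so the projected covariance trace matches the
        # full covariance trace ("full-trace energy", as interpreted here).
        full = np.trace(np.cov(X_real, rowvar=False))
        kept = np.trace(np.cov(Z_r, rowvar=False))
        s = np.sqrt(full / kept)
        Z_r, Z_g = Z_r * s, Z_g * s
    return Z_r, Z_g
```

The reduced arrays can then be fed to the moment computations of Section 1, shrinking the coskewness tensor from $d^3$ to $k^3$ entries.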
4. Behavior Under Distortions and Empirical Performance
SID has been empirically analyzed on Inception-v3 features of ImageNet data, as well as other datasets and architectures. Key findings include:
- FID increases linearly with added Gaussian noise, even where perturbations are imperceptible to humans.
- The skewness component of SID remains near zero for small, imperceptible noise levels, increasing only when distortions become visible (Luzi et al., 2023).
- This suggests SID’s sensitivity to perceptually meaningful—but not purely statistical—differences, sometimes aligning more closely with observer judgments than FID.
- SID’s empirical stability is demonstrated across different feature extractors and is robust to PCA-based dimensionality reduction.
5. Applications Beyond GAN Evaluation
Although motivated by the limitations of Gaussian-based metrics in GAN assessment, SID is generally applicable to any learning scenario where feature distributions are compared:
- Evaluation of other generative models (e.g., diffusion models)
- Out-of-distribution detection
- Few-shot learning, where feature normality is often (implicitly or explicitly) assumed
Extensions include replacing PCA with random projections for further computational gains and substituting alternative skewness metrics (e.g., Mardia’s, Kollo’s) provided an appropriate embedding into a metric space is possible (Luzi et al., 2023).
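The random-projection variant mentioned above could look like the following (a hypothetical drop-in replacement for the PCA step, not the paper's implementation):

```python
import numpy as np

def random_reduce(X_real, X_gen, k, seed=0):
    """Project both feature sets through a shared random Gaussian map,
    avoiding the cost of fitting PCA on the real features."""
    rng = np.random.default_rng(seed)
    d = X_real.shape[1]
    # Johnson-Lindenstrauss-style scaling keeps expected norms comparable.
    W = rng.normal(size=(d, k)) / np.sqrt(k)
    return X_real @ W, X_gen @ W
```

Unlike PCA, the map is data-independent, so it costs only a single matrix multiply per feature set, at the price of not concentrating variance in the retained dimensions.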
6. Computational Complexity, Limitations, and Extensions
The main computational hurdle in SID is the coskewness calculation. Without dimensionality reduction, its memory and computation costs are prohibitive. With PCA reduction to a modest $k$, the entire process requires approximately 4.7 seconds on CPU and 0.02 seconds on GPU, using roughly 128 MB of RAM (Luzi et al., 2023). Limitations include:
- SID inherits FID’s bias when sample sizes are small.
- SID may disregard non-moment-based differences; distinct distributions sharing their first three moments yield $\mathrm{SID} = 0$.
- The cube-root normalization and optional scaling introduce tunable hyperparameters.
- Projecting via PCA may discard information in lower-variance modes.
Potential extensions include using alternative third-moment distances and extending SID to new use-cases or domains with non-Gaussian feature distributions (Luzi et al., 2023).
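The memory figures quoted in this section follow from simple tensor-size arithmetic, assuming 8-byte floats (the reduced dimension below is illustrative, not necessarily the paper's setting):

```python
# A dense d x d x d third-moment tensor of 8-byte floats scales as d**3.
full_dim, reduced_dim = 2048, 256        # reduced_dim chosen for illustration

def tensor_gib(d):
    """Memory in GiB for a dense d^3 tensor of float64 entries."""
    return d ** 3 * 8 / 2 ** 30

print(tensor_gib(full_dim))              # 64.0 GiB at the full feature dimension
print(tensor_gib(reduced_dim) * 1024)    # 128.0 MiB at the example reduced dimension
```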
7. Relationship to Other Moment-Based Metrics
SID generalizes FID by addressing its main limitation: the latter’s reduction of all discrepancy to mean and covariance (two moments), justified only if features are Gaussian. Empirically, many real-data features are strongly non-Gaussian, as evidenced by statistical tests for skewness post-PCA (Luzi et al., 2023). By incorporating skewness, SID provides a more discriminating and nuanced measure for matching real and generated distributions, especially when evaluating visual fidelity and diversity under practical and perceptually relevant distortions.