Skew Inception Distance (SID) Explained

Updated 27 January 2026

Skew Inception Distance (SID) is a metric that extends FID by incorporating third-moment (skewness) information to capture non-Gaussian features in image synthesis.
It computes discrepancies between real and generated distributions using means, covariances, and coskewness tensors, optimized via PCA for efficient calculation.
SID can align more closely with human perception than FID because its skewness term responds mainly to perceptually visible distortions, making it a useful tool for assessing generative models.

Skew Inception Distance (SID) is a statistical metric designed to evaluate the quality of feature distributions produced by generative models, notably Generative Adversarial Networks (GANs). SID explicitly incorporates third-moment (skewness) information in feature space, thereby extending the well-established Fréchet Inception Distance (FID)—which considers only first and second moments. Originating in the context of image synthesis, SID is motivated by the observation that FID’s Gaussian assumption misses non-Gaussian structure present in real-world data. SID is rigorously defined, admits an efficient practical implementation via dimensionality reduction, and exhibits empirical properties distinct from FID, sometimes aligning more closely with human perceptual judgments (Luzi et al., 2023).

1. Mathematical Definition

Let $X_1, \ldots, X_n \in \mathbb{R}^d$ and $Y_1, \ldots, Y_m \in \mathbb{R}^d$ be feature vectors extracted—typically from the penultimate layer of an Inception-v3 network—from real and generated images, respectively. SID compares the empirical distributions of these sets through their first three moments:

Means: $\mu_p = (1/n)\sum_{i=1}^n X_i$, $\mu_q = (1/m)\sum_{j=1}^m Y_j$
Covariances: $\Sigma_p = (1/(n-1))\sum_{i=1}^n (X_i - \mu_p)(X_i - \mu_p)^\top$, and similarly for $\Sigma_q$
Coskewness tensors: $s_p \in \mathbb{R}^{d\times d\times d}$ with entries $(s_p)_{ijk} = (1/n)\sum_{\ell=1}^n X^*_{\ell,i} X^*_{\ell,j} X^*_{\ell,k}$, where $X^*_\ell = \Sigma_p^{-1/2}(X_\ell - \mu_p)$ is the standardized feature vector and $X^*_{\ell,i}$ is its $i$-th component; $s_q$ is defined analogously from $Y$
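The three moment estimates above can be sketched in NumPy as follows. This is a minimal illustration rather than the reference implementation: the function name `feature_moments` and the small eigenvalue floor used to keep $\Sigma^{-1/2}$ finite are assumptions introduced here, and the dense tensor is only practical for small $d$.

```python
import numpy as np

def feature_moments(X):
    """Return (mean, covariance, coskewness tensor) for the rows of X (n x d).

    Hypothetical helper for illustration; not taken from the source.
    """
    X = np.asarray(X, dtype=np.float64)
    n = X.shape[0]
    mu = X.mean(axis=0)
    Xc = X - mu
    sigma = Xc.T @ Xc / (n - 1)  # unbiased sample covariance

    # Sigma^{-1/2} via the eigendecomposition of the symmetric covariance.
    evals, evecs = np.linalg.eigh(sigma)
    evals = np.clip(evals, 1e-12, None)  # assumption: floor tiny eigenvalues
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T

    Z = Xc @ inv_sqrt  # rows are the standardized vectors X*
    # (s)_{ijk} = (1/n) * sum_l Z[l, i] * Z[l, j] * Z[l, k]
    s = np.einsum("li,lj,lk->ijk", Z, Z, Z) / n
    return mu, sigma, s
```

The resulting tensor is symmetric under any permutation of its three indices, which an implementation could exploit to save memory.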

The full Skew Inception Distance is then:

$\mathrm{SID}(P, Q) = \sqrt{ \| \mu_p - \mu_q \|_2^2 + \mathrm{Tr}(\Sigma_p + \Sigma_q - 2(\Sigma_p \Sigma_q)^{1/2}) + \| \alpha(s_p) - \alpha(s_q) \|_F^2 }$

where $\alpha(x) = x^{1/3}$ is applied elementwise to normalize units ("cube-root normalization") (Luzi et al., 2023). The norm in the third term is the Frobenius norm of the order-3 tensor, i.e. the square root of the sum of its squared entries.
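A sketch of how the three terms might be assembled from precomputed moments is given below. The function name `sid_from_moments` is hypothetical. Two choices are assumptions made here rather than details confirmed by the source: the cube root is taken as the real cube root (so negative entries stay negative), and $\mathrm{Tr}((\Sigma_p \Sigma_q)^{1/2})$ is evaluated as the sum of square roots of the eigenvalues of $\Sigma_p \Sigma_q$, which are real and non-negative when both covariances are positive semidefinite.

```python
import numpy as np

def sid_from_moments(mu_p, sigma_p, s_p, mu_q, sigma_q, s_q):
    """Assemble SID from means, covariances, and coskewness tensors (sketch)."""
    mean_term = np.sum((mu_p - mu_q) ** 2)

    # Trace of the matrix square root, computed from eigenvalues.
    eig = np.linalg.eigvals(sigma_p @ sigma_q)
    tr_sqrt = np.sum(np.sqrt(np.clip(eig.real, 0.0, None)))
    cov_term = np.trace(sigma_p) + np.trace(sigma_q) - 2.0 * tr_sqrt

    # Assumption: alpha is the real, sign-preserving cube root.
    skew_term = np.sum((np.cbrt(s_p) - np.cbrt(s_q)) ** 2)

    # Guard against tiny negative totals caused by round-off.
    return float(np.sqrt(max(mean_term + cov_term + skew_term, 0.0)))
```

With identical covariances and coskewness tensors, the value reduces to the Euclidean distance between the two means.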

SID is thus FID augmented by a non-Gaussian skewness component, allowing it to detect discrepancies in higher-order moments between real and generated distributions.

2. Metricity and Pseudometric Properties

SID defines a metric on the space of distributions determined by their first three moments. If the mapping $P \mapsto (\mu_p, \Sigma_p, s_p)$ is injective (that is, the moments characterize the distribution), SID is a true metric; otherwise it is a pseudometric, meaning $\mathrm{SID}(P, Q) = 0$ is possible for distinct $P$ and $Q$ that share the first three moments (Luzi et al., 2023). SID therefore remains a valid quantitative tool, but a SID of zero does not guarantee full distributional equality unless higher moments also match or are irrelevant in the application domain.
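A one-dimensional case illustrates the pseudometric behavior. The pair of distributions below is chosen here for illustration and does not come from the source: $P = \mathcal{N}(0, 1)$ and $Q$ uniform on $[-\sqrt{3}, \sqrt{3}]$. Both are symmetric about zero with unit variance, so their population moments up to order three coincide:

$\mu_p = \mu_q = 0, \quad \Sigma_p = \Sigma_q = 1, \quad s_p = s_q = 0$

$\mathrm{SID}(P, Q) = \sqrt{ 0 + (1 + 1 - 2\sqrt{1 \cdot 1}) + (0 - 0)^2 } = 0$

The two distributions nevertheless differ in their fourth moment, $\mathbb{E}[X^4] = 3$ for $P$ versus $9/5$ for $Q$, a difference that SID cannot register.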

3. Practical Computation and PCA Acceleration

Direct computation of the coskewness term in high dimensions is prohibitive: the tensor has $d^3$ entries, so memory grows as $O(d^3)$ and the naive computation over $n$ samples takes $O(n d^3)$ time. For a typical feature space with $d=2048$, a single tensor has about $8.6 \times 10^9$ entries and requires up to 64 GB of RAM. To address this, dimensionality reduction via principal component analysis (PCA) is performed:

Fit PCA on real feature vectors $X$, obtaining the top $k \ll d$ principal axes.
Project $X$ and $Y$ onto these axes, reducing both to $k$ dimensions.
Optionally, rescale covariance to match the full-trace energy.
Compute mean, covariance, and coskewness in $k$-dimensional space.
Assemble SID using the projected moments (Luzi et al., 2023).
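The fit-and-project steps above might look like the following NumPy sketch. The name `pca_project` is hypothetical, and centering both sets with the real-data mean is an assumption made here so that the difference of means survives the projection. The optional covariance rescaling is omitted because the source does not specify its form.

```python
import numpy as np

def pca_project(X_real, Y_gen, k):
    """Fit PCA on real features and project both sets to k dimensions (sketch)."""
    X_real = np.asarray(X_real, dtype=np.float64)
    Y_gen = np.asarray(Y_gen, dtype=np.float64)

    # Assumption: both sets are centered with the mean of the real features.
    center = X_real.mean(axis=0)

    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X_real - center, full_matrices=False)
    axes = vt[:k].T  # shape (d, k)

    return (X_real - center) @ axes, (Y_gen - center) @ axes
```

The two returned arrays have $k$ columns each, so the means, covariances, and coskewness tensors computed from them have sizes $k$, $k \times k$, and $k \times k \times k$.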

This allows SID, including the skewness term, to be computed in seconds on standard hardware when $k\leq 256$. Because the mean and covariance are computed in the same $k$-dimensional space, FID's trace term stays tractable as well. Robustness checks indicate that the skewness deviations persist after PCA, even down to $k=16$ (Luzi et al., 2023).
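The memory figures for the coskewness tensor follow from simple arithmetic, sketched below. The calculation assumes dense storage with 8-byte double-precision entries and no use of the tensor's symmetry; `coskewness_bytes` is a hypothetical helper name.

```python
def coskewness_bytes(dim, bytes_per_entry=8):
    """Bytes needed to store a dense dim x dim x dim tensor."""
    return dim ** 3 * bytes_per_entry

GIB = 2 ** 30
MIB = 2 ** 20

full = coskewness_bytes(2048) / GIB    # 64.0 GiB at d = 2048
reduced = coskewness_bytes(256) / MIB  # 128.0 MiB at k = 256
```

Reducing the dimension by a factor of 8 therefore shrinks the tensor by a factor of $8^3 = 512$.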

4. Behavior Under Distortions and Empirical Performance

SID has been empirically analyzed on Inception-v3 features of ImageNet data, as well as other datasets and architectures. Key findings include:

FID increases linearly with added Gaussian noise, even where perturbations are imperceptible to humans.
The skewness component of SID remains near zero for small, imperceptible noise levels, increasing only when distortions become visible (Luzi et al., 2023).
This suggests that the skewness term responds to perceptually meaningful differences rather than to purely statistical ones, so SID sometimes aligns more closely with observer judgments than FID.
This behavior holds across different feature extractors and is robust to PCA-based dimensionality reduction.

5. Applications Beyond GAN Evaluation

Although motivated by the limitations of Gaussian-based metrics in GAN assessment, SID is generally applicable to any learning scenario where feature distributions are compared:

Evaluation of other generative models (e.g., diffusion models)
Out-of-distribution detection
Few-shot learning, where feature normality is often (implicitly or explicitly) assumed

Extensions include replacing PCA with random projections for further computational gains and substituting alternative skewness metrics (e.g., Mardia’s, Kollo’s) provided an appropriate embedding into a metric space is possible (Luzi et al., 2023).
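As a rough sketch of the random-projection variant, the PCA step could be swapped for a fixed Gaussian projection matrix, which avoids the cost of fitting principal axes. The function name `random_project`, the $1/\sqrt{k}$ scaling, and the seeding scheme are assumptions made here for illustration; whether such a projection preserves the skewness signal as well as PCA does is not established by this sketch.

```python
import numpy as np

def random_project(X_real, Y_gen, k, seed=0):
    """Project both feature sets with one shared Gaussian random matrix (sketch)."""
    X_real = np.asarray(X_real, dtype=np.float64)
    Y_gen = np.asarray(Y_gen, dtype=np.float64)
    rng = np.random.default_rng(seed)
    d = X_real.shape[1]

    # Assumption: i.i.d. standard normal entries scaled by 1/sqrt(k).
    R = rng.standard_normal((d, k)) / np.sqrt(k)

    # The same matrix must be applied to both sets so they stay comparable.
    return X_real @ R, Y_gen @ R
```

Fixing the seed keeps the projection identical across evaluations, so that scores computed at different times remain comparable.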

6. Computational Complexity, Limitations, and Extensions

The main computational hurdle in SID is the coskewness calculation. Without dimensionality reduction, memory and computation are prohibitive. With PCA to $k=256$, the entire process requires approximately 4.7 seconds on CPU and 0.02 seconds on GPU, using roughly 128 MB RAM (Luzi et al., 2023). Limitations include:

SID inherits FID's bias when sample sizes are small ($n, m < 50\,000$).
SID ignores differences beyond the third moment; distinct distributions that share their first three moments yield $\mathrm{SID} = 0$.
The cube-root normalization and optional scaling introduce tunable hyperparameters.
Projecting via PCA may discard information in lower-variance modes.

Potential extensions include using alternative third-moment distances and extending SID to new use-cases or domains with non-Gaussian feature distributions (Luzi et al., 2023).

7. Relationship to Other Moment-Based Metrics

SID generalizes FID by addressing its main limitation: the latter’s reduction of all discrepancy to mean and covariance (two moments), justified only if features are Gaussian. Empirically, many real-data features are strongly non-Gaussian, as evidenced by statistical tests for skewness post-PCA (Luzi et al., 2023). By incorporating skewness, SID provides a more discriminating and nuanced measure for matching real and generated distributions, especially when evaluating visual fidelity and diversity under practical and perceptually relevant distortions.

References

Luzi et al. (2023). Using Skew to Assess the Quality of GAN-generated Image Features.