FAED: Fréchet Autoencoder Distance Explained

Updated 12 December 2025
  • FAED is a synthetic image quality metric that replaces InceptionV3 with a lightweight convolutional autoencoder, employing Monte Carlo dropout for uncertainty quantification.
  • The approach computes multiple FAED scores by generating empirical Gaussians from stochastic CAE encodings and comparing them to a deterministic reference via squared Fréchet distance.
  • Uncertainty measures—predictive variance and FAED standard deviation—serve to identify out-of-distribution samples and assess the reliability of synthetic image quality evaluations.

The Fréchet Autoencoder Distance (FAED) is a synthetic image quality metric structurally analogous to the well-known Fréchet Inception Distance (FID), but replaces the InceptionV3 feature model with a lightweight convolutional autoencoder (CAE). FAED further introduces uncertainty quantification (UQ) by employing Monte Carlo dropout throughout the encoder, producing not only a distribution of metric values but also explicit, interpretable uncertainty measures, namely the predictive variance of latent embeddings ("pVar") and the standard deviation of the FAED scores ($\sigma_{\mathrm{FAED}}$). These uncertainty scores offer heuristic indicators of the degree to which the samples under evaluation are out-of-distribution relative to the embedding model's training domain, directly informing the trustworthiness of FAED as a synthetic image quality assessment tool (Bench et al., 4 Apr 2025).

1. Formal Definition and Methodology

Given a "test" image set $\hat{X} = \{\hat{x}_1, \dotsc, \hat{x}_N\}$ and a "reference" set $Y = \{y_1, \dotsc, y_N\}$, both consisting of normalized $128 \times 128$ RGB images, the process employs a trained CAE $G$ with encoder $E$ producing $D$-dimensional latent codes $l = E(x)$. Monte Carlo dropout is activated in every encoder layer with a 10% rate during both training and (crucially) evaluation, modeling epistemic uncertainty. For each test image, $M$ stochastic forward passes yield $l_{i,1}, \dotsc, l_{i,M} \in \mathbb{R}^D$.

For each Monte Carlo sample $j$, embeddings across the test set define an empirical Gaussian $\mathcal{N}_{\hat{X}}^{(j)}\bigl(\mu_{\hat{X}}^{(j)}, \Sigma_{\hat{X}}^{(j)}\bigr)$, where:

  • $\mu_{\hat{X}}^{(j)} = \frac{1}{N}\sum_{i=1}^N l_{i,j}$
  • $\Sigma_{\hat{X}}^{(j)} = \frac{1}{N}\sum_{i=1}^N (l_{i,j} - \mu_{\hat{X}}^{(j)})(l_{i,j} - \mu_{\hat{X}}^{(j)})^\top$

The reference statistics $(\mu_y, \Sigma_y)$ are computed deterministically from $E(y_i)$. The $j$th FAED score is the squared Fréchet distance between the two Gaussians:

$$\mathrm{FAED}^{(j)}(\hat{X}, Y) = \bigl\| \mu_{\hat{X}}^{(j)} - \mu_{y} \bigr\|_2^2 + \mathrm{Tr}\Bigl( \Sigma_{\hat{X}}^{(j)} + \Sigma_{y} - 2\bigl(\Sigma_{\hat{X}}^{(j)} \Sigma_{y}\bigr)^{1/2} \Bigr)$$

The procedure results in $M$ FAED scores per evaluation.
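
For concreteness, the squared Fréchet distance between two Gaussians reduces to a short helper. The following is a sketch in NumPy/SciPy, not the authors' code:

import numpy as np
from scipy.linalg import sqrtm

def frechet_distance_sq(mu1, sigma1, mu2, sigma2):
    """Squared Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    # sqrtm can return a complex array due to numerical error; keep the real part.
    covmean = sqrtm(sigma1 @ sigma2).real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(sigma1 + sigma2 - 2.0 * covmean))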

2. CAE Architecture and Training Protocol

The CAE adopts a standard design:

  • Encoder $E$: three $4 \times 4$ convolution layers (stride 2; channel progression $3 \to 128 \to 256 \to 512$), each followed by ReLU and 10% dropout, then flattened and mapped to a 256-dimensional latent by a linear layer.
  • Decoder: linear up-projection to $512 \times 16 \times 16$, three transposed convolutions (mirroring encoder channels in reverse) with ReLU, reconstructing to three output channels.
  • Loss: pixelwise MSE.
  • Training data: ImageWoof (9,035 train / 3,929 val), 25 epochs, batch size 16, Adam optimizer.

Dropout is active throughout the encoder during all stages, ensuring the CAE’s latent space exposes epistemic uncertainty in downstream metrics.
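
A minimal PyTorch sketch of this architecture follows. It is an illustration consistent with the stated hyperparameters; the padding choices, the absence of a final activation, and the use of standard nn.Dropout are assumptions the paper does not pin down.

import torch.nn as nn

class CAE(nn.Module):
    def __init__(self, latent_dim=256, p_drop=0.10):
        super().__init__()
        # Encoder: three 4x4 stride-2 convs (3 -> 128 -> 256 -> 512),
        # each followed by ReLU and 10% dropout; a 128x128 input yields
        # 16x16 feature maps, flattened into a 256-dim latent.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 128, 4, stride=2, padding=1), nn.ReLU(), nn.Dropout(p_drop),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(), nn.Dropout(p_drop),
            nn.Conv2d(256, 512, 4, stride=2, padding=1), nn.ReLU(), nn.Dropout(p_drop),
            nn.Flatten(),
            nn.Linear(512 * 16 * 16, latent_dim),
        )
        # Decoder: linear up-projection to 512x16x16, then mirrored
        # transposed convs back down to three output channels.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512 * 16 * 16),
            nn.Unflatten(1, (512, 16, 16)),
            nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)          # latent code l = E(x)
        return self.decoder(z), z    # reconstruction and latent

Training would pair this model with the pixelwise MSE loss and Adam optimizer described above.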

3. Uncertainty Quantification in FAED

Monte Carlo dropout, maintained at inference time, produces a distribution of CAE encodings for each input. Uncertainty quantification is achieved via two principal measures:

  • Predictive variance (pVar): captures average variance across latent dimensions and input images:

$$\mathrm{pVar} = \frac{1}{D}\sum_{k=1}^D \left[ \frac{1}{N}\sum_{i=1}^N \mathrm{Var}_{j=1,\dotsc,M}\bigl(l_{i,j,k}\bigr) \right]$$

This reflects epistemic uncertainty inherent in the feature mapping.

  • Standard deviation of FAED ($\sigma_{\mathrm{FAED}}$): quantifies how embedding variance amplifies through the Fréchet distance computation:

$$\sigma_{\mathrm{FAED}} = \sqrt{ \frac{1}{M}\sum_{j=1}^M \bigl( \mathrm{FAED}^{(j)} - \overline{\mathrm{FAED}} \bigr)^2 }$$

The magnitude of both metrics empirically correlates with the degree of input domain shift. High values flag increasing unreliability of the FAED itself. These measures are to be reported alongside the mean FAED for comprehensive interpretability (Bench et al., 4 Apr 2025).
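
Given the stacked stochastic latents and per-pass scores, both measures are simple reductions. A sketch, where the array names L (shape (M, N, D)) and faed (shape (M,)) are illustrative:

import numpy as np

# pVar: variance over the M dropout passes for each image and latent
# dimension, then averaged over images and dimensions.
p_var = L.var(axis=0).mean()

# sigma_FAED: population standard deviation of the M FAED scores.
sigma_faed = np.asarray(faed).std()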

4. Reference Implementation and Pseudocode

The computation proceeds as follows:

  • Deterministically compute $(\mu_y, \Sigma_y)$ from $Y$ (without dropout).
  • For $j = 1, \dotsc, M$:
    • Draw a stochastic encoding $l_{i,j}$ for every $\hat{x}_i$ in $\hat{X}$ using dropout.
    • Aggregate to $\mu_{\hat{X}}^{(j)}$, $\Sigma_{\hat{X}}^{(j)}$.
    • Compute $\mathrm{FAED}^{(j)}$ via the Fréchet distance between the empirical and reference Gaussians.
  • Return $[\mathrm{FAED}^{(1)}, \dotsc, \mathrm{FAED}^{(M)}]$.

import numpy as np

# E(image, stochastic) denotes the trained CAE encoder (signature
# illustrative); frechet_distance_sq is the helper from Section 1.
# bias=True gives the 1/N covariance used in the definitions above.
Z = np.stack([E(y, stochastic=False) for y in Y])         # deterministic
mu_y = Z.mean(axis=0)
sigma_y = np.cov(Z, rowvar=False, bias=True)

faed = []
for j in range(M):
    L = np.stack([E(x, stochastic=True) for x in X_hat])  # with dropout
    mu_x = L.mean(axis=0)
    sigma_x = np.cov(L, rowvar=False, bias=True)
    faed.append(frechet_distance_sq(mu_x, sigma_x, mu_y, sigma_y))
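
Because model.eval() disables dropout in PyTorch, keeping it stochastic at inference (as FAED requires) is commonly handled by switching only the dropout modules back to training mode; a sketch of this standard idiom:

import torch.nn as nn

def enable_mc_dropout(model: nn.Module) -> None:
    """Put dropout layers in training mode so they stay stochastic at
    inference, while the rest of the model remains in eval mode."""
    model.eval()
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()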

5. Empirical Validation and Domain Sensitivity

FAED’s sensitivity to domain shift and OOD effects is established via controlled perturbations of the evaluation data. Using the ImageWoof dataset for CAE training, five test sets were constructed, ordered by increasing “domain gap”:

Test set                                    Mean FAED   σ_FAED   pVar
ImageWoof (baseline)                           25.923    0.019   0.0051
ImageWoof + 2% Gaussian noise                  29.183    0.021   0.0042
ImageWoof + 5 self-overlays                    50.276    0.030   0.0062
ImageWoof + 5 random Imagenette overlays       62.225    0.042   0.0070
Imagenette (no dog)                           143.932    0.085   0.0082

Mean FAED and $\sigma_{\mathrm{FAED}}$ increase monotonically with domain shift, and pVar follows the same overall trend (apart from a small dip for the Gaussian-noise variant), validating the metric's effectiveness as a quality indicator and the utility of its uncertainty scores for evaluating trustworthiness. The nearly linear dependence on perturbation severity (Figure 1) confirms fine-grained sensitivity (Bench et al., 4 Apr 2025).

6. Interpretation and Practical Guidance

Practical use of FAED mandates co-reporting of the mean FAED, pVar, and $\sigma_{\mathrm{FAED}}$. Key insights:

  • Diagnostic role of uncertainty metrics: Large pVar or $\sigma_{\mathrm{FAED}}$ indicates that latent encodings are sampled from regions not well supported by the CAE's training data, flagging possible unreliability of the FAED. pVar isolates epistemic uncertainty at the embedding level; $\sigma_{\mathrm{FAED}}$ includes compounded uncertainty effects in the FAED formula.
  • Domain specificity: For evaluation outside natural image domains (e.g., medical imaging), the CAE must be trained on in-domain data. Only trust FAED where uncertainties remain low by empirical thresholds.
  • Dropout tuning: The dropout rate impacts uncertainty calibration and must be set for the task, or augmented with additional UQ techniques as needed.
  • Summary: FAED extends FID, combining the familiar Fréchet-distance formulation with explicit UQ, enabling robust application in settings where the reliability of synthetic image quality judgments is essential.

7. Relation to FID and Broader Implications

FAED redirects the dependency from an ImageNet-pretrained classifier (InceptionV3) to an unsupervised, domain-adaptive CAE. This replacement is particularly significant for fields where pretrained discriminative models may lack epistemic coverage—most notably in scientific or medical image synthesis. Uncertainty quantification via Monte Carlo dropout provides explicit trustworthiness signals, which are critical for high-stakes applications. A plausible implication is that similar UQ procedures could be integrated with other synthetic data metrics by analogously equipping embedding models with dropout or Bayesian techniques, although this is not explicit in the current results (Bench et al., 4 Apr 2025).
