
SPADE: Semi-supervised Anomaly Detection

Updated 9 February 2026
  • SPADE is a semi-supervised anomaly detection framework that addresses mismatches between labeled and unlabeled data distributions using ensemble one-class classifiers, self-supervised learning, and adaptive thresholding.
  • It employs partial matching with Wasserstein distance to autonomously set score thresholds, ensuring robust pseudo-label assignment amid diverse anomaly types.
  • Empirical results demonstrate substantial AUC improvements on tabular, image, and fraud detection tasks, validating its practical effectiveness despite distribution shifts.

SPADE refers to the Semi-supervised Pseudo-labeler Anomaly Detection with Ensembling framework for semi-supervised anomaly detection under distribution mismatch (Yoon et al., 2022). This method addresses the challenge where labeled and unlabeled samples originate from different distributions—a frequent violation in practical anomaly detection scenarios. SPADE fuses an ensemble of one-class classifiers with robust, automatic hyperparameter selection, leveraging Wasserstein distance-based partial matching, and integrates self-supervised representation learning for enhanced performance.

1. Problem Formulation and Distribution Mismatch

SPADE is formulated for the generic semi-supervised anomaly detection problem. The dataset consists of a labeled subset $\mathcal{D}^l = \{(x_i^l, y_i^l)\}_{i=1}^{N_l}$ and an unlabeled subset $\mathcal{D}^u = \{x_j^u\}_{j=1}^{N_u}$, drawn from potentially distinct feature distributions $\mathcal{P}_X^l$ and $\mathcal{P}_X^u$ (i.e., $\mathcal{P}_X^l \ne \mathcal{P}_X^u$). Labels are $y \in \{0,1\}$, with $y=1$ denoting anomaly; anomalous examples are rare: $P(y=1) \ll P(y=0)$. The goal is to learn a classifier $f:\mathcal{X}\to[0,1]$ that minimizes

$$\mathbb{E}_{x,y\sim(\mathcal{P}_X^l \cup \mathcal{P}_X^u,\, f^*(x))}\left[\mathcal{L}(f(x), y)\right]$$

without assuming matched distributions between labeled and unlabeled data.

SPADE's primary contribution is its explicit handling of this distribution mismatch, enabling robust learning when, for example, (a) labeled data contains only a subset of anomaly types, (b) labeled data contains only “easy-to-label” normals or anomalies, or (c) unlabeled data possesses non-overlapping anomaly types.

2. Ensemble Pseudo-labeling via One-Class Classifiers

To provide reliable pseudo-labels for unlabeled data amid distribution mismatch, SPADE constructs an ensemble $\{o_k\}_{k=1}^K$ of one-class classifiers (OCCs). Each OCC $o_k$ is fit on the union of labeled normal samples $D_0^l$ and a unique partition $D_k^u$ of the unlabeled data:

  • $D_0^l = \{x_i^l : y_i^l = 0\}$
  • $\mathcal{D}^u = \bigcup_{k=1}^K D_k^u$

The OCCs output anomaly scores $o_k(h(x)) \in \mathbb{R}$, where $h$ is the learnable feature encoder. Pseudo-labels are determined by unanimous voting and score thresholding:

  • Each $o_k$ uses lower and upper thresholds $(\eta_k^n, \eta_k^p)$.
  • For each $x^u$:
    • $\hat y_k^{nu} = 1$ if $o_k(h(x^u)) < \eta_k^n$, else $0$
    • $\hat y_k^{pu} = 1$ if $o_k(h(x^u)) > \eta_k^p$, else $0$
  • The final pseudo-label $v(h(x^u))$ is:
    • $0$ (normal) if $\prod_k \hat y_k^{nu} = 1$
    • $1$ (anomalous) if $\prod_k \hat y_k^{pu} = 1$
    • $-1$ (uncertain) otherwise

This pseudo-labeling is robust to distribution shift, as only consensus assignments are accepted.
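The unanimous-voting rule above can be sketched in a few lines of NumPy. The function name, array layout, and toy thresholds here are illustrative, not from the SPADE implementation, which operates on scores $o_k(h(x))$ from $K$ trained OCCs:

```python
import numpy as np

def pseudo_label(scores, eta_n, eta_p):
    """Unanimous-vote pseudo-labeling sketch.

    scores: (K, N) anomaly scores from K one-class classifiers for N points.
    eta_n, eta_p: length-K arrays of per-classifier lower/upper thresholds.
    Returns labels in {0, 1, -1}: normal, anomalous, or uncertain.
    """
    below = scores < eta_n[:, None]   # votes "normal": score below eta^n
    above = scores > eta_p[:, None]   # votes "anomalous": score above eta^p
    labels = np.full(scores.shape[1], -1)   # default: uncertain
    labels[below.all(axis=0)] = 0           # unanimous normal vote
    labels[above.all(axis=0)] = 1           # unanimous anomalous vote
    return labels

# Toy example: K=2 classifiers, N=3 unlabeled points.
scores = np.array([[0.1, 0.9, 0.5],
                   [0.2, 0.8, 0.6]])
print(pseudo_label(scores, np.array([0.3, 0.3]), np.array([0.7, 0.7])))
# -> [ 0  1 -1]
```

Note that the third point gets label $-1$: the classifiers agree it is neither confidently below both lower thresholds nor confidently above both upper thresholds, so it is excluded from the supervised loss.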

3. Partial Matching: Hyperparameter-Free Thresholding

Hyperparameter selection, particularly of the score thresholds $(\eta_k^n, \eta_k^p)$, is a crucial challenge under mismatch due to the lack of representative validation data. SPADE introduces a "partial matching" principle: thresholds are tuned so that the empirical score distributions of (i) positive-labeled samples and high-scoring unlabeled samples, and (ii) negative-labeled samples and low-scoring unlabeled samples, are as close as possible in Wasserstein (earth mover's) distance:

$$\eta_k^p = \arg\min_{\eta} W\big(\{o_k(h(x_i^l)) : y_i^l=1\},\ \{o_k(h(x^u)) : o_k(h(x^u)) > \eta\}\big)$$

$$\eta_k^n = \arg\min_{\eta} W\big(\{o_k(h(x_i^l)) : y_i^l=0\},\ \{o_k(h(x^u)) : o_k(h(x^u)) < \eta\}\big)$$

where $W(\cdot, \cdot)$ is the 1-D Wasserstein distance.
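The upper-threshold search can be illustrated with SciPy's 1-D Wasserstein distance. The brute-force sweep over candidate thresholds below is a minimal sketch under assumed variable names, not the paper's implementation:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def partial_match_threshold(labeled_pos_scores, unlabeled_scores):
    """Choose eta^p so that the unlabeled scores above eta^p best match
    the labeled-positive score distribution in Wasserstein distance."""
    best_eta, best_w = None, np.inf
    for eta in np.unique(unlabeled_scores):
        tail = unlabeled_scores[unlabeled_scores > eta]
        if len(tail) == 0:
            continue
        w = wasserstein_distance(labeled_pos_scores, tail)
        if w < best_w:
            best_eta, best_w = eta, w
    return best_eta

# Synthetic scores: labeled anomalies cluster high; unlabeled data is a
# mixture of a large normal mode and a small anomaly mode.
rng = np.random.default_rng(0)
pos = rng.normal(5.0, 0.5, 100)
unl = np.concatenate([rng.normal(0.0, 0.5, 900),
                      rng.normal(5.0, 0.5, 100)])
eta_p = partial_match_threshold(pos, unl)
print(eta_p)  # lands between the two modes
```

Intuitively, the selected $\eta$ carves out exactly the unlabeled tail that "looks like" the labeled anomalies, which is what makes the rule hyperparameter-free.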

In pure positive-unlabeled (PU) settings, where only one class is labeled, SPADE defaults to Otsu-style thresholding.
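A generic Otsu-style threshold on 1-D scores maximizes the between-class variance of the induced two-way split. The histogram-based routine below (with an illustrative `n_bins` choice) is a textbook sketch of that idea, not SPADE's code:

```python
import numpy as np

def otsu_threshold(scores, n_bins=64):
    """Pick the threshold maximizing between-class variance w0*w1*(mu0-mu1)^2."""
    hist, edges = np.histogram(scores, bins=n_bins)
    p = hist / hist.sum()                     # bin probabilities
    centers = (edges[:-1] + edges[1:]) / 2    # bin midpoints
    best_t, best_var = centers[0], -1.0
    for k in range(1, n_bins):
        w0, w1 = p[:k].sum(), p[k:].sum()     # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0
        mu1 = (p[k:] * centers[k:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, centers[k]
    return best_t

# Bimodal toy scores: many low (normal) and few high (anomalous).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 500), rng.normal(6, 1, 50)])
print(otsu_threshold(x))  # falls between the two modes
```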

4. Overall Algorithmic Workflow

The SPADE training routine alternates ensemble pseudo-labeler construction and parameter updates, integrating self-supervised representation learning for the encoder. The primary steps are:

  • Build the pseudo-labeler by partitioning $\mathcal{D}^u$, training $K$ OCCs on $D_0^l \cup D_k^u$, and setting thresholds via partial matching.
  • Collect pseudo-labeled points with unanimous OCC consensus.
  • Minimize a total loss across labeled and pseudo-labeled samples, combining:
    • Supervised BCE loss over $\mathcal{D}^l$
    • BCE loss over the pseudo-labeled subset $\mathcal{D}^{u+}$
    • Self-supervised objective $L_{\text{self}}$ (e.g., reconstruction or contrastive) over both $\mathcal{D}^l$ and $\mathcal{D}^u$
  • Repeat until convergence; at inference, the output anomaly score is the sigmoid output $q_\phi(h_\theta(x))$.

For tabular data, $h$ is a shallow MLP and $o_k$ is a Gaussian-mixture density estimator. For images, $h$ is a ResNet-18, $g$ is a CutPaste-style projection head, and $o_k$ is a Gaussian density estimator (GDE) over the learned representations.
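The combined objective from the workflow above can be written schematically as follows. The weights `alpha` and `beta` and all variable names are illustrative assumptions; the self-supervised term is treated as a precomputed scalar for brevity:

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy with clipping for numerical stability."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def spade_loss(p_l, y_l, p_u, y_hat_u, l_self, alpha=1.0, beta=1.0):
    """Schematic total loss: supervised BCE on D^l, BCE on confidently
    pseudo-labeled points (label -1 means uncertain and is dropped),
    plus a weighted self-supervised term l_self."""
    mask = y_hat_u != -1
    return (bce(p_l, y_l)
            + alpha * bce(p_u[mask], y_hat_u[mask])
            + beta * l_self)

# Toy example: two labeled points, three unlabeled (one left uncertain).
p_l = np.array([0.9, 0.1]); y_l = np.array([1, 0])
p_u = np.array([0.8, 0.2, 0.5]); y_hat_u = np.array([1, 0, -1])
print(round(spade_loss(p_l, y_l, p_u, y_hat_u, l_self=0.3), 3))
# -> 0.629
```

Masking out the $-1$ labels is the key design point: uncertain points contribute only through the self-supervised term, never through a possibly wrong hard label.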

5. Computational Aspects and Complexity

Typical runtime per experiment (single NVIDIA V100):

  • Tabular benchmarks: $\leq 1$ hour per run
  • Image benchmarks: $\leq 4$ hours per run

The computational burden is dominated by self-supervised encoder updates; OCC ensemble re-training is lightweight (only once per epoch, using shallow estimators).

6. Empirical Performance and Quantitative Benchmarks

SPADE demonstrates state-of-the-art anomaly detection performance under distribution mismatch across tabular, image, and real-world fraud datasets. Key results include:

  • Tabular, “new anomalies” scenario: +10.6% increase in overall AUC (Thyroid: 0.921 vs. 0.815 supervised)
  • Image, “new anomalies”: SPADE achieves 87.9% AUC vs. 81.4% (FixMatch) on MVTec, 85.2% vs. 69.1% on Magnetic datasets
  • Fraud detection: On time-shifted Kaggle credit card data (5% labels), 98.2% AUC (SPADE) vs. 94.1% (VIME), vs. 97.5% (supervised); Xente: 92.0% vs. 85.9%
  • Pure PU tabular: +15% AUC over BaggingPU and Elkanoto on missed anomaly types

In all cases, SPADE’s improvements stem from robust pseudo-labeling, distribution-aware thresholding, and effective self-supervised representation learning.

7. Limitations, Assumptions, and Future Directions

SPADE's performance depends on the quality of the OCC base models; unreliable density estimates on $h(x)$ can degrade pseudo-labeler precision, though the framework reports pseudo-positive label precision $\geq 80\%$ at high anomaly-score percentiles. The method treats labeled and unlabeled sets symmetrically at inference, so further distributional shifts at test time may warrant explicit adaptation (e.g., via domain-adversarial losses or moment alignment).

SPADE does not explicitly model fine-grained, continuous covariate drift between domains. Potential extensions include: end-to-end neural OCCs supplanting GDEs, adaptive ensemble sizes, learnable thresholding networks, or multi-class anomaly detection under mismatch.

8. Significance and Summary

SPADE provides a canonical, practically robust methodology for semi-supervised anomaly detection under realistic distribution mismatches, unifying an ensemble OCC-based pseudo-labeler, self-supervised feature learning, and unsupervised hyperparameter selection by partial distribution matching. It delivers consistent, notable improvements in AUC for both tabular and vision domains, particularly in settings with non-overlapping anomalies and major label bias, and eliminates reliance on assumption-matched validation splits (Yoon et al., 2022).
