Intra-Class Splitting (ICS) Explained
- Intra-Class Splitting (ICS) is a technique that decomposes a normal class into 'typical' and 'atypical' subsets based on internal structure for anomaly detection.
- The method uses autoencoders to generate reconstruction-based similarity scores, designating fringe samples as proxies for anomalies and enabling supervised learning.
- ICS improves recognition performance by optimizing closeness among typical samples and dispersion of atypical samples, and has shown success on high-dimensional datasets.
Intra-Class Splitting (ICS) is a methodology that decomposes a given class, most often the “normal” or positive class in one-class or open-set classification, into two disjoint subsets, labeled “typical” and “atypical,” based entirely on internal structure revealed by the data. The typical subset captures the core distributional modes of the known class, while the atypical subset consists of statistically or representationally marginal (i.e., fringe) samples, which then serve as a surrogate for unknown or abnormal data. ICS recasts one-class and open-set recognition problems as supervised learning by providing a pseudo-negative or “unknown” class for end-to-end discriminative feature and decision-boundary learning. ICS operates without explicit negative samples during training and has been shown to be effective in deep feature learning, one-class classification, and open-set recognition on high-dimensional visual datasets (Schlachter et al., 2018, Schlachter et al., 2019, Schlachter et al., 2019).
1. ICS Formalization and Motivation
In the context of one-class classification, the normal class is the only observed category during training; there are no explicit abnormal or outlier samples available. Traditional approaches such as one-class SVM or deep autoencoder-based anomaly detection either require external reference data to pretrain feature representations or are limited by their reliance on pixel-wise errors that may not correspond to semantic anomalies (Schlachter et al., 2018, Schlachter et al., 2019). ICS addresses this by synthetically generating a pseudo-negative class from the structure of the normal data itself.
For a given dataset of normal instances, an autoencoder is trained to reconstruct inputs. Each sample $x$ is scored by a similarity metric (commonly SSIM or negative MSE) between $x$ and its autoencoder reconstruction $\hat{x}$. By thresholding this score at a fixed ratio (e.g., 10%), the lowest-scoring samples are assigned as “atypical,” with the remainder forming the “typical” core. The underlying motivation is that samples poorly reconstructed by the autoencoder are more likely to represent the fringe or outlier modes of the data, thus serving as effective stand-ins for actual anomalies (Schlachter et al., 2018, Schlachter et al., 2019, Schlachter et al., 2019).
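The scoring and thresholding step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it uses negative MSE as the similarity score, and the names `intra_class_split`, `negative_mse`, and `toy_reconstruct` are hypothetical. In particular, `toy_reconstruct` is only a stand-in for a trained autoencoder.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def negative_mse(x: Vector, x_hat: Vector) -> float:
    """Similarity score: higher means the reconstruction is closer to x."""
    return -sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def intra_class_split(
    samples: Sequence[Vector],
    reconstruct: Callable[[Vector], Vector],
    ratio: float = 0.1,
) -> Tuple[List[int], List[int]]:
    """Return (typical_indices, atypical_indices) for one class.

    The `ratio` fraction of samples with the lowest similarity to their
    reconstruction is marked atypical; the rest are typical.
    """
    scores = [negative_mse(x, reconstruct(x)) for x in samples]
    order = sorted(range(len(samples)), key=lambda i: scores[i])  # ascending
    n_atypical = int(round(ratio * len(samples)))
    atypical = sorted(order[:n_atypical])
    typical = sorted(order[n_atypical:])
    return typical, atypical


# Hypothetical stand-in for a trained autoencoder: it clips values to [0, 1],
# so samples with out-of-range entries are reconstructed poorly.
def toy_reconstruct(x: Vector) -> List[float]:
    return [min(1.0, max(0.0, v)) for v in x]


data = [[0.1, 0.2], [0.5, 0.5], [0.9, 0.3], [0.4, 0.8], [3.0, -2.0],
        [0.2, 0.6], [0.7, 0.7], [0.3, 0.1], [0.6, 0.4], [0.8, 0.9]]
typical_idx, atypical_idx = intra_class_split(data, toy_reconstruct, ratio=0.1)
```

With a 10% ratio on ten samples, exactly one sample is split off: the one that the stand-in reconstructs worst. Swapping `negative_mse` for an SSIM function would leave the rest of the procedure unchanged.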
2. ICS Algorithms and Network Architectures
The generic ICS workflow is composed of three phases:
- Autoencoder Pretraining: A deep autoencoder is trained to minimize reconstruction loss across the normal dataset (Schlachter et al., 2018, Schlachter et al., 2019).
- Splitting Phase: For every sample $x_i$, compute a similarity score $s_i = S(x_i, \hat{x}_i)$ against its reconstruction $\hat{x}_i$, order all instances by ascending $s_i$, and select the lowest $\rho$ fraction as the atypical subset, with the remainder forming the typical subset. SSIM or negative MSE are common choices for $S$; $\rho$ is typically a small fixed ratio (e.g., 10%) (Schlachter et al., 2018, Schlachter et al., 2019, Schlachter et al., 2019).
- Joint Training: The neural network (comprising feature extraction, classification, and sometimes a distance subnetwork) is optimized to achieve:
  - Closeness among typicals: A closeness loss encourages latent vectors from the typical subset to concentrate.
  - Dispersion between and among atypicals: Two dispersion losses enforce that atypical normals are distributed apart from each other and from the typicals in latent space.
  - Binary or multiclass discrimination: A standard cross-entropy or binary loss discriminates (for one-class) typical versus atypical, or (for open-set) the known $K$ classes versus “atypical” samples relabeled as class $K+1$ (Schlachter et al., 2019, Schlachter et al., 2019).
A combined loss, formed as a weighted sum of these terms, is optimized, with coefficients chosen to balance compactness and separation (Schlachter et al., 2018).
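To make the idea of balancing coefficients concrete, one way to write such an objective is given below. The weights $\lambda_{c}$, $\lambda_{aa}$, $\lambda_{ta}$ and the term names are notation assumed here for illustration; the papers' exact symbols and weighting scheme are not reproduced.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{task}}
  + \lambda_{c}\,\mathcal{L}_{\text{close}}
  + \lambda_{aa}\,\mathcal{L}_{\text{disp}}^{\text{aa}}
  + \lambda_{ta}\,\mathcal{L}_{\text{disp}}^{\text{ta}}
```

In this sketch, $\mathcal{L}_{\text{task}}$ stands for the reconstruction or classification loss, $\mathcal{L}_{\text{close}}$ for the closeness term over typical samples, and the two dispersion terms act on atypical-atypical and typical-atypical pairs respectively. Raising $\lambda_{c}$ favors a compact typical cluster, while raising the dispersion weights favors separation.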
In open-set settings, the typical and atypical splits are performed within each class using a classifier’s softmax confidence as the splitting criterion. All atypicals are pooled and relabeled as an extra “unknown” class, so that a problem with $K$ known classes is trained as a straightforward $(K+1)$-class softmax classifier. Closed-set accuracy is preserved via a dual-head architecture: the main (OS) head produces $K+1$ probabilities, and the auxiliary (CS) head preserves known-class discrimination using only the first $K$ classes (Schlachter et al., 2019).
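The per-class relabeling can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `relabel_atypical_as_unknown` is hypothetical, and "softmax confidence" is taken here to mean the probability the classifier assigns to the sample's true class, which is one plausible reading rather than a confirmed detail of the method.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def relabel_atypical_as_unknown(
    labels: Sequence[int],
    confidences: Sequence[float],
    num_known: int,
    ratio: float = 0.1,
) -> List[int]:
    """Relabel the least-confident `ratio` of each class as class `num_known`.

    labels:       ground-truth class index in [0, num_known) per sample
    confidences:  softmax probability assigned to the true class per sample
                  (assumed definition of "confidence")
    Returns new labels in [0, num_known], where `num_known` means "unknown".
    """
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    new_labels = list(labels)
    for members in by_class.values():
        members.sort(key=lambda i: confidences[i])  # least confident first
        n_atypical = int(round(ratio * len(members)))
        for i in members[:n_atypical]:
            new_labels[i] = num_known  # pooled "unknown" class
    return new_labels


labels = [0, 0, 0, 0, 1, 1, 1, 1]
confidences = [0.99, 0.55, 0.97, 0.98, 0.96, 0.95, 0.51, 0.99]
open_set_labels = relabel_atypical_as_unknown(
    labels, confidences, num_known=2, ratio=0.25
)
```

The split is done separately inside each class, so every known class contributes its own fringe samples to the pooled "unknown" class regardless of how confident the classifier is on other classes.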
3. Loss Functions and Training Dynamics
ICS methods fundamentally rely on a combination of closeness and dispersion objectives in the latent space, in addition to the conventional autoencoder reconstruction and (bi/multi)-class discrimination losses.
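- Closeness Loss (typical samples): a distance between latent vectors $z$ (the encoder outputs), normalized by the latent dimensionality $d$ and summed over pairs of typical samples. This enforces the compactness of the typical subset (Schlachter et al., 2018, Schlachter et al., 2019).
- Dispersion Losses (atypical samples): two terms, one computed over pairs of atypical samples and one over typical-atypical pairs. These encourage atypical samples to spread and to be distant from the typical cluster (Schlachter et al., 2018).
- Binary/Multi-Class Losses: For binary ICS in one-class settings, binary cross-entropy on the “typical”/“atypical” labels is used, which in its standard form reads
$$\mathcal{L}_{\text{BCE}} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log p_i + (1 - y_i)\log(1 - p_i) \,\right]$$
where $y_i \in \{0, 1\}$ is the typical/atypical label of sample $i$ and $p_i$ is the predicted probability of the positive label. For open-set, cross-entropy is extended to $(K+1)$-way multiclass (Schlachter et al., 2019).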
Training either alternates between these losses or optimizes them jointly. Ablation studies indicate that removing the dispersion constraints collapses the latent space and diminishes anomaly discrimination (Schlachter et al., 2018, Schlachter et al., 2019).
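The interplay of these objectives can be illustrated with a small pure-Python sketch. The functional forms are assumptions made for illustration, not the formulas from the papers: closeness is taken as the mean dimension-normalized squared distance over typical pairs, dispersion as a hinge penalty on pairs that lie closer than a hypothetical `margin`, and the weights `w_close` and `w_disp` are arbitrary.

```python
import math
from itertools import combinations, product
from typing import Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance divided by the latent dimensionality."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def closeness_loss(typical: List[Vector]) -> float:
    """Mean normalized squared distance over typical pairs (needs >= 2 samples)."""
    pairs = list(combinations(typical, 2))
    return sum(sq_dist(a, b) for a, b in pairs) / len(pairs)


def dispersion_loss(pairs: Iterable[Tuple[Vector, Vector]], margin: float = 1.0) -> float:
    """Assumed hinge form: penalize pairs whose distance is below `margin`."""
    pair_list = list(pairs)
    return sum(max(0.0, margin - sq_dist(a, b)) for a, b in pair_list) / len(pair_list)


def binary_cross_entropy(labels: Sequence[int], probs: Sequence[float], eps: float = 1e-12) -> float:
    """Standard mean binary cross-entropy with clipping for numerical safety."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(1.0 - eps, max(eps, p))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# Toy latent codes: a tight typical cluster and two distant atypical samples.
typical = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
atypical = [[2.0, 2.0], [-2.0, 2.0]]

l_close = closeness_loss(typical)
l_disp_aa = dispersion_loss(combinations(atypical, 2))    # among atypicals
l_disp_ta = dispersion_loss(product(typical, atypical))   # typical vs atypical
l_bce = binary_cross_entropy([1, 1, 1, 0, 0], [0.9, 0.8, 0.95, 0.2, 0.1])

w_close, w_disp = 1.0, 1.0  # hypothetical balancing weights
total = l_bce + w_close * l_close + w_disp * (l_disp_aa + l_disp_ta)
```

In this toy configuration both dispersion terms are already zero because every atypical code lies beyond the margin. If the dispersion terms were dropped altogether, nothing would stop the encoder from mapping all samples to the same point, which is the collapse that the ablation result describes.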
4. Empirical Results and Benchmarks
ICS methods have been systematically evaluated on MNIST, Fashion-MNIST, CIFAR-10, and SVHN. In one-class feature learning, ICS encoders coupled with a one-class SVM achieve superior balanced accuracy compared to conventional autoencoder baselines, transfer learning from ImageNet, or prior OCC approaches:
| Dataset | ICS (mean ± std) | CAE | Deep SVDD | ImageNet+SVM | Original AE |
|---|---|---|---|---|---|
| MNIST | 91.3% ± 0.7 | 85.2% | — | 68.7% | 84.2% |
| FMNIST | 87.7% ± 0.5 | 81.6% | — | 65.2% | 82.3% |
| CIFAR-10 | 60.6% ± 1.2 | 56.9% | — | 53.7% | 56.5% |
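Balanced accuracy, the metric reported in the table, is the mean of the per-class recalls, which makes it insensitive to the imbalance between normal and anomalous test samples. A minimal sketch of the computation follows; the function name and the toy labels are illustrative and not taken from the papers.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, y in enumerate(y_true) if y == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# 1 = normal, 0 = anomalous; the test set is imbalanced (8 normal, 2 anomalous).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)  # recall(1) = 1.0, recall(0) = 0.5
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here plain accuracy is 0.9 while balanced accuracy is 0.75, because the missed anomaly counts for half of the anomalous class.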
ICS is also reported to achieve the highest area under the ROC curve (AUC) on a number of the one-class tasks across datasets, outperforming Deep SVDD in AUC on CIFAR-10 and also prevailing over classical methods (Schlachter et al., 2019, Schlachter et al., 2018).
For open-set recognition, ICS with closed-set regularization is evaluated by balanced accuracy on MNIST, SVHN, and CIFAR-10, where it consistently surpasses Weibull SVM, OCSVM, autoencoder-based splits, and counterfactual generation approaches. ICS remains robust under increasing openness conditions, with performance degrading gracefully as the number of unknown classes increases (Schlachter et al., 2019). Ablation experiments confirm that both the splitting procedure and latent space constraints are indispensable for optimal performance.
5. Applications: One-Class and Open-Set Recognition
ICS is primarily targeted at two application domains:
- One-Class Classification (OCC): ICS transforms the OCC setting into an in-situ binary problem by treating atypical normal samples as a proxy for anomalies. The network learns features that tightly enclose the core distribution of the normal class while providing explicit separation from its synthesized fringe (Schlachter et al., 2018, Schlachter et al., 2019).
- Open-Set Recognition (OSR): In OSR, ICS splits each known class into typical and atypical subsets (using classifier softmax confidence), relabels all atypicals across classes as “unknown,” and trains a $(K+1)$-way classifier over the $K$ known classes plus “unknown,” augmented with a closed-set regularizer to maintain within-class accuracy. During inference, samples assigned to the “unknown” output are rejected (Schlachter et al., 2019).
Empirical results underscore that ICS-based methods yield substantial gains over classical anomaly detection, SVM-based, and generative adversarial learning-based open-set baselines.
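The open-set inference rule, rejecting any sample whose most probable output is the "unknown" class, can be sketched as follows. The function name `predict_or_reject` and the convention that "unknown" is the last output are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence


def predict_or_reject(probabilities: Sequence[float]) -> Optional[int]:
    """Return the predicted known class, or None when "unknown" wins.

    `probabilities` has K+1 entries; the last one is assumed to be the
    "unknown" class.
    """
    unknown_index = len(probabilities) - 1
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return None if best == unknown_index else best


batch = [
    [0.80, 0.15, 0.05],  # confident in known class 0
    [0.10, 0.20, 0.70],  # "unknown" output wins, so the sample is rejected
    [0.30, 0.60, 0.10],  # known class 1
]
decisions: List[Optional[int]] = [predict_or_reject(p) for p in batch]
```

Because rejection is just another class in the softmax, no separate threshold has to be tuned at test time in this sketch.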
6. Generalization, Limitations, and Extensions
ICS is effective when the normal data exhibit a coherent mode with a dense core and a meaningful fringe. However, when the normal class is multi-modal without a clear core, or when atypical samples themselves form secondary clusters, the utility of the split may be undermined; forcing dispersion within such a structure could remove useful discriminative information or distort semantically meaningful substructure (Schlachter et al., 2018).
Extensions proposed in the literature include:
- Adaptive splitting: Optimizing the split ratio $\rho$ or implementing a learned threshold via bi-level optimization.
- Alternative splitting criteria: Employing Gaussian mixture model likelihoods, $k$-means distance, or one-class SVM outputs in place of autoencoder-based similarity.
- End-to-end differentiable splitting: Incorporating the splitting as a learnable percentile or sorting module within the network forward pass (Schlachter et al., 2018).
- Regularization strategies: Closed-set heads in open-set networks to maintain in-domain class accuracy after relabeling the atypical subset (Schlachter et al., 2019).
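As an illustration of an alternative splitting criterion from the list above, the sketch below scores each sample by its distance to the nearest cluster centroid and marks the farthest fraction as atypical. It is a hypothetical example: the centroids are assumed to come from a $k$-means fit that is not shown, and `split_by_centroid_distance` is not a function from the cited work.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_centroid_distance(x: Vector, centroids: Sequence[Vector]) -> float:
    """Euclidean distance from x to its closest centroid."""
    return min(math.dist(x, c) for c in centroids)


def split_by_centroid_distance(
    samples: Sequence[Vector], centroids: Sequence[Vector], ratio: float = 0.1
) -> Tuple[List[int], List[int]]:
    """Return (typical, atypical) indices; the farthest `ratio` is atypical."""
    order = sorted(
        range(len(samples)),
        key=lambda i: nearest_centroid_distance(samples[i], centroids),
        reverse=True,  # farthest from every centroid first
    )
    n_atypical = int(round(ratio * len(samples)))
    return sorted(order[n_atypical:]), sorted(order[:n_atypical])


# Centroids are assumed to come from a k-means fit that is not shown here.
centroids = [[0.0, 0.0], [5.0, 5.0]]
samples = [[0.1, 0.0], [5.0, 5.2], [2.5, 2.5], [0.0, 0.3], [4.9, 5.0]]
typical_idx, atypical_idx = split_by_centroid_distance(samples, centroids, ratio=0.2)
```

Using several centroids lets a multi-modal normal class keep each of its modes in the typical set, so only samples far from every mode are treated as fringe.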
Existing methods are predominantly validated on visual data; generalization to non-image or non-Euclidean domains remains unproven (Schlachter et al., 2019).
7. Comparative Table: ICS Key Configurations
| ICS Variant | Split Criterion | Loss Components | Primary Domain |
|---|---|---|---|
| 2018 One-Class (Schlachter et al., 2018) | Autoencoder SSIM/MSE | Closeness and dispersion losses | One-class / OCC |
| 2019 Deep One-Class (Schlachter et al., 2019) | Autoencoder SSIM | Closeness, binary X-ent, dispersion | Deep OCC |
| 2019 Open-Set (Schlachter et al., 2019) | Classifier softmax confidence | $(K+1)$-class X-ent, closed-set reg | Open-set recognition |
Closeness and dispersion losses are central in all variants; the split is performed according to autoencoder similarity in unsupervised settings and via classifier confidence in open-set settings. End-to-end discriminative learning is enabled in every case by introducing pseudo-negative examples.
References
- (Schlachter et al., 2018) One-Class Feature Learning Using Intra-Class Splitting
- (Schlachter et al., 2019) Deep One-Class Classification Using Intra-Class Splitting
- (Schlachter et al., 2019) Open-Set Recognition Using Intra-Class Splitting