Constant Error Carousel (CEC)
- Constant Error Carousel (CEC) is a method for online noisy label detection in speaker recognition that tracks temporal prediction errors across epochs.
- It employs Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC) to filter out mislabeled data while progressively introducing hard samples using curriculum learning.
- CEC outperforms traditional techniques by avoiding extra network complexity and prior noise estimation, significantly lowering error rates on benchmarks like VoxCeleb.
Constant Error Carousel (CEC) is an online method for noisy label detection designed for speaker recognition tasks. Leveraging the temporal consistency of prediction errors over successive training epochs, CEC employs statistical metrics to isolate and remove mislabeled data, thereby enhancing model robustness and verification accuracy. Its approach is curriculum-based, does not require prior knowledge of noise proportions, and avoids extra network complexity, setting it apart from traditional filtering techniques.
1. Motivation and Design Principles
Noisy labels in speaker recognition datasets—common even in “well-annotated” corpora—can significantly degrade training effectiveness and final model performance. Many prior noisy label detection methods are unsuitable for large-scale, dynamic scenarios, requiring either prior noise estimates, supplementary neural architectures, or batch-stage intervention late in training. CEC, in contrast, uses the longitudinal misclassification record of each sample, allowing fast, online, and accurate detection without these dependencies.
CEC’s central premise is that samples with truly noisy labels are persistently misclassified relative to their given labels across consecutive epochs. By statistically tracking and analyzing these temporal patterns, CEC efficiently isolates noisy samples for exclusion.
2. Epoch-wise Sample Categorization
At each training epoch, every sample is mapped to one of three categories based on model predictions and cosine similarity metrics:
$$C_i = \begin{cases} 2, & \text{if } y_i \neq \tilde{y}_i \\ 1, & \text{if } y_i = \tilde{y}_i \text{ and } (s_P < \tau_P \text{ or } s_N > \tau_N) \\ 0, & \text{otherwise} \end{cases}$$
Where:
- $C_i$: current sample category (2: inconsistent, 1: hard, 0: easy)
- $y_i$: ground-truth label
- $\tilde{y}_i$: predicted label
- $s_P$: cosine similarity to the correct class vector
- $s_N$: maximal cosine similarity to any incorrect class
- $\tau_P, \tau_N$: pre-defined thresholds
Interpretation of categories:
- Inconsistent samples ($C_i = 2$): misclassified relative to their given labels.
- Hard samples ($C_i = 1$): classified correctly but close to the decision boundary.
- Easy samples ($C_i = 0$): classified confidently and correctly.
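As a concrete sketch, the per-epoch categorization could be implemented as follows. The threshold values, array shapes, and function name are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def categorize(labels, logits, tau_p=0.4, tau_n=0.3):
    """Assign each sample a category: 2 (inconsistent), 1 (hard), 0 (easy).

    labels: (N,) integer ground-truth labels
    logits: (N, K) cosine similarities to each class vector
    tau_p, tau_n: illustrative thresholds (assumed values)
    """
    preds = logits.argmax(axis=1)
    n = len(labels)
    s_p = logits[np.arange(n), labels]         # similarity to the labeled class
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf     # exclude the labeled class
    s_n = masked.max(axis=1)                   # best wrong-class similarity

    cats = np.zeros(n, dtype=int)              # default: easy
    hard = (preds == labels) & ((s_p < tau_p) | (s_n > tau_n))
    cats[hard] = 1
    cats[preds != labels] = 2                  # misclassification overrides
    return cats
```

A correctly predicted sample is still flagged "hard" when its similarity to the true class is low or a wrong class sits dangerously close.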
3. Statistical Metrics: Continuous and Total Inconsistent Counting
CEC introduces two noise-tracking statistics:
- Continuous Inconsistent Counting (CIC): measures streak length of consecutive misclassifications.
- Total Inconsistent Counting (TIC): aggregates all misclassification occurrences per sample across epochs.
Formally, letting $C_i^{(t)}$ be sample $i$'s category at epoch $t$:

$$\mathrm{CIC}_i^{(t)} = \begin{cases} \mathrm{CIC}_i^{(t-1)} + 1, & \text{if } C_i^{(t)} = 2 \\ 0, & \text{otherwise} \end{cases} \qquad \mathrm{TIC}_i^{(t)} = \mathrm{TIC}_i^{(t-1)} + \mathbb{1}\!\left[C_i^{(t)} = 2\right]$$

CIC is sensitive to early, consecutive errors; TIC accumulates all inconsistent occurrences and is robust to intermittent fluctuation.
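The two counters can be maintained with simple per-sample arrays. This array-based bookkeeping is my own sketch of the update rule described above:

```python
import numpy as np

def update_counts(cic, tic, cats):
    """Update per-sample counters after one epoch.

    cic: (N,) consecutive-inconsistency streaks (reset on a consistent epoch)
    tic: (N,) total inconsistency counts (monotonically increasing)
    cats: (N,) categories from the current epoch (2 = inconsistent)
    """
    inconsistent = (cats == 2)
    cic = np.where(inconsistent, cic + 1, 0)   # extend the streak or reset it
    tic = tic + inconsistent                   # accumulate every occurrence
    return cic, tic
```

Note that a single consistent epoch zeroes CIC, while TIC never decreases; this is exactly what makes CIC an early-warning signal and TIC a long-horizon one.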
4. Noisy Label Filtering via Thresholding
A sample is flagged as noisy and removed from further training as soon as either counter exceeds its respective threshold:

$$\text{noisy}_i = (\mathrm{CIC}_i > \theta_C) \lor (\mathrm{TIC}_i > \theta_T)$$

Here:
- $\theta_C$: CIC threshold for early aggressive filtering
- $\theta_T$: TIC threshold for comprehensive late-stage removal
- $\lor$: logical or

This two-pronged scheme allows rapid exclusion of “obvious” noisy labels via CIC and mop-up of persistent, less blatant cases via TIC. Removal is performed immediately upon threshold breach.
5. Curriculum-based Handling of Hard Samples
To safeguard against the accidental exclusion of challenging—but valid—samples, a curriculum learning paradigm is employed:
- During initial epochs, updates are restricted to easy samples.
- As the model improves, hard samples are gradually “unlocked” for weight modification according to an adaptive threshold tied to the epoch index.
- Only hard samples with a difficulty score beneath the current threshold are used for updates.
This staged introduction mitigates overfitting to noise and enhances generalization capacity.
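The staged unlocking can be sketched as a per-epoch sample mask. The linear schedule and the use of $1 - s_P$ as a difficulty proxy are assumptions for illustration; the paper's exact schedule and difficulty measure are not reproduced here:

```python
import numpy as np

def training_mask(cats, difficulty, epoch, total_epochs):
    """Select which samples contribute to weight updates this epoch.

    cats: (N,) categories (0 easy, 1 hard, 2 inconsistent)
    difficulty: (N,) scores in [0, 1], higher = harder
                (e.g. 1 - s_P; an assumed proxy, not the paper's definition)
    epoch / total_epochs: position in an assumed linear unlocking schedule
    """
    threshold = epoch / total_epochs            # grows as training progresses
    easy = (cats == 0)                          # always eligible
    unlocked_hard = (cats == 1) & (difficulty < threshold)
    return easy | unlocked_hard                 # inconsistent samples excluded
```

Early on the threshold is near zero, so only easy samples train the model; by the final epochs nearly all hard samples are admitted, while inconsistent samples never are.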
6. Comparative Perspective
The following table summarizes CEC’s features against established noisy label detection schemes:
| Method | Needs Prior Noise Proportion? | Filtering Stage | Extra Network/Data? | Robustness to Noise | Reported Performance |
|---|---|---|---|---|---|
| Co-teaching | Yes | Late/Batch | Yes | Moderate | Good on clean/less noisy data |
| O2U-Net | Yes | Late/Batch | No | Good | Best at low noise |
| OR-Gate | Yes | Per-epoch | No | Low/variable | Weak for augmented data |
| CEC | No | Online/Immediate | No | High | Best on noisy/real data |
CEC uniquely delivers online, high-recall filtering, without prior noise ratio estimation or auxiliary network overhead.
7. Empirical Results and Significance
Experimental results on VoxCeleb and synthetically corrupted benchmarks demonstrate that CEC yields lower Equal Error Rates at all noise levels, particularly outperforming baselines where label corruption is substantial. Both recall and precision for noisy label detection are competitive or superior compared to leading methods. The dual-metric counting, in tandem with curriculum learning, ensures that neither underfitting nor overfitting to noise predominates.
A plausible implication is that CEC’s label filtering can be generalized to other classification regimes where consistency over training epochs reflects ground truth fidelity.
8. Summary
CEC (Constant Error Carousel) is a principled, efficient strategy for noisy label detection in speaker recognition. It relies on the statistical persistence and totality of error patterns across training epochs to differentiate “true” noisy samples from hard and easy data using online, threshold-based pruning. CIC facilitates rapid initial filtering, whereas TIC enables persistent, late-stage refinement. The incorporation of curriculum learning for hard samples ensures the method is robust to both excessive pruning and label pollution, contributing to improved overall model reliability and verification accuracy.