Constant Error Carousel (CEC)
- Constant Error Carousel (CEC) is a method for online noisy label detection in speaker recognition that tracks temporal prediction errors across epochs.
- It employs Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC) to filter out mislabeled data while progressively introducing hard samples using curriculum learning.
- CEC outperforms traditional techniques by avoiding extra network complexity and prior noise estimation, significantly lowering error rates on benchmarks like VoxCeleb.
Constant Error Carousel (CEC) is an online method for noisy label detection designed for speaker recognition tasks. Leveraging the temporal consistency of prediction errors over successive training epochs, CEC employs statistical metrics to isolate and remove mislabeled data, thereby enhancing model robustness and verification accuracy. Its approach is curriculum-based, does not require prior knowledge of noise proportions, and avoids extra network complexity, setting it apart from traditional filtering techniques.
1. Motivation and Design Principles
Noisy labels in speaker recognition datasets—common even in “well-annotated” corpora—can significantly degrade training effectiveness and final model performance. Many prior noisy label detection methods are unsuitable for large-scale, dynamic scenarios, requiring either prior noise estimates, supplementary neural architectures, or batch-stage intervention late in training. CEC, in contrast, uses the longitudinal misclassification record of each sample, allowing fast, online, and accurate detection without these dependencies.
CEC’s central premise is that samples with truly noisy labels are persistently misclassified relative to their given labels across consecutive epochs. By statistically tracking and analyzing these temporal patterns, CEC efficiently isolates noisy samples for exclusion.
2. Epoch-wise Sample Categorization
At each training epoch, every sample is mapped to one of three categories based on model predictions and cosine similarity metrics:
$$C_i = \begin{cases} 2, & \text{if } y_i \neq \tilde{y}_i \\ 1, & \text{if } y_i = \tilde{y}_i \text{ and } (s_P < \tau_P \text{ or } s_N > \tau_N) \\ 0, & \text{otherwise} \end{cases}$$
Where:
- $C_i$: current sample category (2: inconsistent, 1: hard, 0: easy)
- $y_i$: ground-truth label
- $\tilde{y}_i$: predicted label
- $s_P$: cosine similarity to the correct class vector
- $s_N$: maximal cosine similarity to any incorrect class
- $\tau_P, \tau_N$: pre-defined thresholds
Interpretation of categories:
- Inconsistent samples ($C_i = 2$): misclassified relative to their given labels.
- Hard samples ($C_i = 1$): classified correctly but close to the decision boundary.
- Easy samples ($C_i = 0$): classified confidently and correctly.
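As a concrete sketch, the per-epoch categorization could be implemented as follows. The threshold values, array shapes, and function name are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def categorize(labels, logits, tau_p=0.4, tau_n=0.3):
    """Assign each sample a category: 2 (inconsistent), 1 (hard), 0 (easy).

    labels: (N,) integer ground-truth labels
    logits: (N, K) cosine similarities to each class vector
    tau_p, tau_n: illustrative thresholds (assumed values)
    """
    preds = logits.argmax(axis=1)
    n = len(labels)
    s_p = logits[np.arange(n), labels]         # similarity to the labeled class
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf     # exclude the labeled class
    s_n = masked.max(axis=1)                   # best wrong-class similarity

    cats = np.zeros(n, dtype=int)              # default: easy
    hard = (preds == labels) & ((s_p < tau_p) | (s_n > tau_n))
    cats[hard] = 1
    cats[preds != labels] = 2                  # misclassification overrides
    return cats
```

A correctly predicted sample is still flagged "hard" when its similarity to the true class is low or a wrong class sits dangerously close.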
3. Statistical Metrics: Continuous and Total Inconsistent Counting
CEC introduces two noise-tracking statistics:
- Continuous Inconsistent Counting (CIC): measures streak length of consecutive misclassifications.
- Total Inconsistent Counting (TIC): aggregates all misclassification occurrences per sample across epochs.
Formally, letting $C_i^{(t)}$ be sample $i$'s category at epoch $t$:

$$\mathrm{CIC}_i^{(t)} = \begin{cases} \mathrm{CIC}_i^{(t-1)} + 1, & \text{if } C_i^{(t)} = 2 \\ 0, & \text{otherwise} \end{cases} \qquad \mathrm{TIC}_i^{(t)} = \mathrm{TIC}_i^{(t-1)} + \mathbb{1}\!\left[C_i^{(t)} = 2\right]$$

CIC is sensitive to early, consecutive errors; TIC accumulates all inconsistent occurrences and is robust to intermittent fluctuation.
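The two counters can be maintained with simple per-sample arrays. This array-based bookkeeping is my own sketch of the update rule described above:

```python
import numpy as np

def update_counts(cic, tic, cats):
    """Update per-sample counters after one epoch.

    cic: (N,) consecutive-inconsistency streaks (reset on a consistent epoch)
    tic: (N,) total inconsistency counts (monotonically increasing)
    cats: (N,) categories from the current epoch (2 = inconsistent)
    """
    inconsistent = (cats == 2)
    cic = np.where(inconsistent, cic + 1, 0)   # extend the streak or reset it
    tic = tic + inconsistent                   # accumulate every occurrence
    return cic, tic
```

Note that a single consistent epoch zeroes CIC, while TIC never decreases; this is exactly what makes CIC an early-warning signal and TIC a long-horizon one.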
4. Noisy Label Filtering via Thresholding
A sample is flagged as noisy and removed from further training as soon as either counter exceeds its respective threshold:

$$\text{noisy}_i = (\mathrm{CIC}_i > \theta_C) \lor (\mathrm{TIC}_i > \theta_T)$$

Here:
- $\theta_C$: CIC threshold for early aggressive filtering
- $\theta_T$: TIC threshold for comprehensive late-stage removal
- $\lor$: logical or

This two-pronged scheme allows rapid exclusion of “obvious” noisy labels via CIC and mop-up of persistent, less blatant cases via TIC. Removal is performed immediately upon threshold breach.
5. Curriculum-based Handling of Hard Samples
To safeguard against the accidental exclusion of challenging—but valid—samples, a curriculum learning paradigm is employed:
- During initial epochs, updates are restricted to easy samples.
- As the model improves, hard samples are gradually “unlocked” for weight modification according to an adaptive threshold tied to the epoch index.
- Only hard samples with a difficulty score beneath the current threshold are used for updates.
This staged introduction mitigates overfitting to noise and enhances generalization capacity.
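The staged unlocking can be sketched as a per-epoch sample mask. The linear schedule and the use of $1 - s_P$ as a difficulty proxy are assumptions for illustration; the paper's exact schedule and difficulty measure are not reproduced here:

```python
import numpy as np

def training_mask(cats, difficulty, epoch, total_epochs):
    """Select which samples contribute to weight updates this epoch.

    cats: (N,) categories (0 easy, 1 hard, 2 inconsistent)
    difficulty: (N,) scores in [0, 1], higher = harder
                (e.g. 1 - s_P; an assumed proxy, not the paper's definition)
    epoch / total_epochs: position in an assumed linear unlocking schedule
    """
    threshold = epoch / total_epochs            # grows as training progresses
    easy = (cats == 0)                          # always eligible
    unlocked_hard = (cats == 1) & (difficulty < threshold)
    return easy | unlocked_hard                 # inconsistent samples excluded
```

Early on the threshold is near zero, so only easy samples train the model; by the final epochs nearly all hard samples are admitted, while inconsistent samples never are.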
6. Comparative Perspective
The following table summarizes CEC’s features against established noisy label detection schemes:
| Method | Needs Prior Noise Proportion? | Filtering Stage | Extra Network/Data? | Robustness to Noise | Reported Performance |
|---|---|---|---|---|---|
| Co-teaching | Yes | Late/Batch | Yes | Moderate | Good on clean/less noisy data |
| O2U-Net | Yes | Late/Batch | No | Good | Best at low noise |
| OR-Gate | Yes | Per-epoch | No | Low/variable | Weak for augmented data |
| CEC | No | Online/Immediate | No | High | Best on noisy/real data |
CEC uniquely delivers online, high-recall filtering, without prior noise ratio estimation or auxiliary network overhead.
7. Empirical Results and Significance
Experimental results on VoxCeleb and synthetically corrupted benchmarks demonstrate that CEC yields lower Equal Error Rates at all noise levels, particularly outperforming baselines where label corruption is substantial. Both recall and precision for noisy label detection are competitive or superior compared to leading methods. The dual-metric counting, in tandem with curriculum learning, ensures that neither underfitting nor overfitting to noise predominates.
A plausible implication is that CEC’s label filtering can be generalized to other classification regimes where consistency over training epochs reflects ground truth fidelity.
8. Summary
CEC (Constant Error Carousel) is a principled, efficient strategy for noisy label detection in speaker recognition. It relies on the statistical persistence and totality of error patterns across training epochs to differentiate “true” noisy samples from hard and easy data using online, threshold-based pruning. CIC facilitates rapid initial filtering, whereas TIC enables persistent, late-stage refinement. The incorporation of curriculum learning for hard samples ensures the method is robust to both excessive pruning and label pollution, contributing to improved overall model reliability and verification accuracy.