
Centre-Enhanced Discriminative Learning (CEDL)

Updated 22 November 2025
  • CEDL is a learning framework that integrates centre-based geometric constraints into loss functions, unifying discrimination with geometric compactness.
  • It combines softmax and centre losses to improve cluster tightness, decision boundary clarity, and interpretability in diverse applications.
  • The method demonstrates robust performance across modalities, enhancing both multi-class tasks like speech emotion recognition and binary tasks like anomaly detection.

Centre-Enhanced Discriminative Learning (CEDL) denotes a family of end-to-end learning frameworks in which "centre-based" geometric constraints serve to enhance the discriminative power of neural representations. CEDL methods integrate class- or group-specific geometric centres into loss functions, pulling representations of the same class (or of the normal group) closer together in a learned space, while ensuring inter-class or anomaly vs. normal separability. The approach has been applied in both multi-class settings such as speech emotion recognition and binary settings such as supervised anomaly detection, providing principled, interpretable, and often backbone-agnostic improvements in cluster compactness and decision boundary clarity (Dai et al., 2 Jan 2025, Darban et al., 15 Nov 2025).

1. Core Principles and Mathematical Underpinnings

The central mechanism of CEDL is the introduction of explicit centre-based objectives into the classification or anomaly detection loss, resulting in representational spaces where classes or groups are both separated and internally compact.

Multi-Class Centre Loss Integration

In multi-class settings, the loss function combines a standard softmax cross-entropy term, which ensures inter-class separability, with a "center loss" term that reduces intra-class variance:

$$L = L_{\mathrm{softmax}} + \lambda\, L_{\mathrm{center}}$$

where $L_{\mathrm{softmax}}$ is the softmax cross-entropy loss, and $L_{\mathrm{center}} = \frac{1}{m}\sum_{i=1}^m \|z_i - c_{y_i}\|_2^2$ penalizes the distance from each embedding $z_i$ to its class centre $c_{y_i}$ (Dai et al., 2 Jan 2025). The trade-off parameter $\lambda$ is tuned to balance class separability and intra-class compactness.
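
A minimal PyTorch-style sketch of this combined objective, assuming batch embeddings `z`, integer labels `y`, classifier `logits`, and a matrix of current class centres `centers` (all names are illustrative, not taken from the cited papers):

```python
import torch
import torch.nn.functional as F

def cedl_multiclass_loss(logits, z, y, centers, lam=0.3):
    """Combined objective L = L_softmax + lambda * L_center.

    logits:  (m, K) class scores from the classifier head
    z:       (m, d) embeddings from the encoder
    y:       (m,)   integer class labels
    centers: (K, d) current class centres c_k
    lam:     trade-off weight lambda
    """
    l_softmax = F.cross_entropy(logits, y)
    # Mean squared Euclidean distance of each embedding to its class centre
    l_center = ((z - centers[y]) ** 2).sum(dim=1).mean()
    return l_softmax + lam * l_center
```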

Radial Centre-Based Logits for Binary/Anomaly Detection

In binary or anomaly detection contexts, a radial centre-based logit replaces the conventional linear logit, defining normality in terms of Euclidean distance to a centre $c$:

$$a_i = \frac{\alpha}{\sqrt{D}\,\|r_i - c\|_2}$$

$$s_i = \sigma(a_i) = \frac{1}{1 + \exp(-a_i)}$$

This score is used in a centre-enhanced weighted binary cross-entropy loss (“CEDL loss”):

$$\ell_{\mathrm{CEDL}}(r_i, y_i) = w_1\, y_i\, \mathrm{softplus}(-a_i) + w_0\, (1-y_i)\, \mathrm{softplus}(a_i)$$

This loss unifies discrimination (by class label) with a geometric constraint (distance from centre encodes normality) (Darban et al., 15 Nov 2025).
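
A hedged PyTorch-style sketch of the radial logit and the weighted softplus loss written above; the names `r`, `c`, `w1`, `w0` and the label convention are illustrative assumptions rather than the reference implementation:

```python
import torch
import torch.nn.functional as F

def radial_logit(r, c, alpha=1.0):
    """a_i = alpha / (sqrt(D) * ||r_i - c||_2) for representations r of shape (m, D)."""
    D = r.shape[1]
    dist = torch.norm(r - c, dim=1).clamp_min(1e-12)  # guard against division by zero
    return alpha / (D ** 0.5 * dist)

def cedl_binary_loss(r, y, c, alpha=1.0, w1=1.0, w0=1.0):
    """Weighted loss w1*y*softplus(-a) + w0*(1-y)*softplus(a), averaged over the batch.

    Samples with y = 1 are pulled toward the centre (high s_i);
    samples with y = 0 are pushed away from it.
    """
    a = radial_logit(r, c, alpha)
    loss = w1 * y * F.softplus(-a) + w0 * (1 - y) * F.softplus(a)
    return loss.mean()
```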

2. Network Architectures and Centre Dynamics

CEDL frameworks are modality-agnostic in principle, adapting the encoder design to the data modality while integrating centre-based logic at the representation layer.

Speech Emotion Recognition Example

  • Input: Variable-length spectrograms ($L_T \times L_F$). Log-Mel and log-STFT spectral representations are supported.
  • Encoder: CNN stack for local pattern extraction, followed by bi-directional RNN (GRU) for temporal integration. The CNN output is a sequence; the final GRU states form a 256-dimensional vector.
  • Embedding: The vector is projected via a fully-connected (FC1) layer with PReLU to a $d=64$-dimensional space.
  • Classifier: FC2 with softmax provides emotion logits.
  • Centre Management: Per-class centres $c_k$ are initialized to zero and updated per mini-batch. Updates are convex combinations of the previous centre and the batch mean (controlled by a rate parameter $\alpha$), ensuring gradual and smooth movement of class centres (Dai et al., 2 Jan 2025); a sketch of this update follows this list.
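
A minimal sketch of such a batch-wise centre update, assuming a simple convex-combination rule with rate `alpha`; the exact rule in Dai et al. may differ in detail:

```python
import torch

@torch.no_grad()
def update_centres(centres, z, y, alpha=0.5):
    """Batch-wise centre update: convex combination of previous centre and batch mean.

    centres: (K, d) current per-class centres, modified in place
    z:       (m, d) embeddings of the current mini-batch
    y:       (m,)   labels; only classes present in the batch are updated
    alpha:   update rate controlling how far centres move per batch
    """
    for k in y.unique():
        batch_mean = z[y == k].mean(dim=0)
        centres[k] = (1 - alpha) * centres[k] + alpha * batch_mean
    return centres
```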

Anomaly Detection Example

  • Encoder ($\psi$): Backbone is data-dependent:
    • Tabular: 4-layer MLP with decreasing dimensions and tanh bottleneck.
    • Time-series: 1D convolutional residual networks with batch normalization.
    • Images: LeNet-style CNNs adapted to dataset (e.g., MNIST, CIFAR-10).
  • Centre Handling: Typically fixed at the origin, but may be jointly learned.
  • Output and Scoring: Only the centre and encoder are retained at inference. The anomaly score is simply $\|\psi(x) - c\|_2$ (Darban et al., 15 Nov 2025); see the scoring sketch after this list.
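
A minimal sketch of this inference-time scoring, with `psi` standing in for the trained encoder and `c` for the retained centre (illustrative names):

```python
import torch

@torch.no_grad()
def anomaly_score(psi, x, c):
    """Score a batch x: Euclidean distance of its embedding from the centre c.

    Larger distances indicate stronger anomalies; no threshold or calibration
    is baked in, matching the distance-only scoring described above.
    """
    return torch.norm(psi(x) - c, dim=1)
```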

3. Training Protocols and Hyperparameter Strategies

CEDL methods share several characteristic training patterns:

  • Optimizers: Adam is used with learning rates of $3\times10^{-4}$ (Dai et al., 2 Jan 2025) or $1\times10^{-4}$ (Darban et al., 15 Nov 2025).
  • Mini-batch Sizes: Typically 32–64, reflecting modality and dataset size.
  • Loss Weighting: Class imbalance is addressed via inverse-frequency weighting (e.g., $\omega_{y_i} \propto 1/\#\{\text{samples of class } y_i\}$); a minimal weighting sketch appears after this list.
  • Centre Update Rate ($\alpha$): Robust to tuning across [0.2, 0.8]; e.g., $\alpha=0.5$ yielded the best results in speech experiments (Dai et al., 2 Jan 2025).
  • Trade-off ($\lambda$, or the scale $\alpha$): Should be tuned via a development fold. For speech emotion, $\lambda \approx 0.3$ maximized gains (Dai et al., 2 Jan 2025).
  • Early Stopping: Selection by best unweighted accuracy (UA) or minimum CEDL loss on validation.
  • Class-Centre Initialization: Typically zero-centred; updates only for classes present in batch.
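
For the loss-weighting item above, a small sketch of one common inverse-frequency scheme (the normalization to mean 1 is an assumption, not necessarily the papers' exact formula):

```python
import torch

def inverse_frequency_weights(y, num_classes):
    """Per-class weights proportional to 1 / (class count), rescaled to mean 1."""
    counts = torch.bincount(y, minlength=num_classes).float().clamp_min(1)
    weights = 1.0 / counts
    return weights * num_classes / weights.sum()
```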

A summary of protocol elements appears in the table:

| Component | Speech CEDL (Dai et al., 2 Jan 2025) | Anomaly CEDL (Darban et al., 15 Nov 2025) |
|---|---|---|
| Encoder | CNN + bi-GRU | MLP / 1D CNN / LeNet-style CNN |
| Centre handling | Per-class, updated per batch | Fixed at origin |
| Loss | Softmax CE + center loss | Weighted BCE on radial logit |
| Optimizer | Adam, $3\times10^{-4}$ | Adam, $1\times10^{-4}$ |
| $\alpha$, $\lambda$ | $\alpha=0.5$, $\lambda=0.3$ | $\alpha=1.0$ (scale parameter) |

4. Empirical Results and Performance Characteristics

Empirical results demonstrate that CEDL produces measurable improvements in both class-discriminative and geometry-aware objectives.

Speech Emotion Recognition

  • Dataset: IEMOCAP (5 emotion classes, 5531 utterances).
  • Log-Mel Input: Baseline UA/WA of 63.80%/61.83% improved to 66.86%/65.40% (>3% absolute gain).
  • Log-STFT Input: Baseline UA/WA of 60.98%/58.93% improved to 65.13%/62.96% (>4% absolute gain).
  • Cluster Compactness: PCA-projected embedding visualizations reveal marked tightening of clusters with centre loss engaged.

Anomaly Detection

  • Datasets: 12 tabular, 4 time-series, 3 image sets (including MNIST, Fashion-MNIST, CIFAR-10).
  • Metrics: AUROC, AUPR, best F1.
  • Average Rank: Across all modalities, CEDL attains the best average rank:
    • Tabular: 1.3/6 (AUROC), 1.8/6 (AUPR), 1.25/6 (F1)
    • Time-series: 1.5/6, 1.0/6, 1.0/6
    • Images: 1.67/6, 1.33/6, 1.33/6
  • Edge Cases: Notably, on highly imbalanced temporal datasets (e.g., Yahoo, SMAP), CEDL's AUPR exceeded 0.80 where competitors scored around 0.3–0.5. On 'unseen anomaly' image tasks (CIFAR-10), CEDL improved AUROC by ~2–4% and AUPR by ~5% over next best (Darban et al., 15 Nov 2025).

5. Interpretation, Advantages, and Limitations

Interpretability and Simplicity

  • Direct Anomaly Scoring: In anomaly detection, the post-training score is the Euclidean distance from the learned centre—offering interpretable, calibration-free measures.
  • No Auxiliary Mining: CEDL does not require complex pairwise/triplet mining or auxiliary SVM classifiers as in contrastive or margin-based schemes.

Unified Losses

  • End-to-End Geometry and Discrimination: The core loss unifies geometric compactness and discrimination in a single formulation, unlike previous methods that regularize the latent space separately from label prediction (Darban et al., 15 Nov 2025).

Limitations and Workarounds

  • Single-Centre Assumption: Both centre loss and radial logit CEDL presume unimodality within each class or the normal group. Performance can degrade on datasets with highly multi-modal class distributions.
  • Hyperparameter Sensitivity: While the centre update rate ($\alpha$) is robust, trade-off parameters ($\lambda$ or the scale $\alpha$) may require careful cross-validation in extreme imbalance scenarios.
  • Radial Saturation: High anomaly rates (>15–20%) can induce sigmoid saturation, mildly blurring the separation rim in latent space (Darban et al., 15 Nov 2025).

6. Extensions and Broader Applicability

Several prospective extensions and domain adaptations are identified:

  • Multi-Centre CEDL: Learning multiple centroids to address multi-modal normal or class distributions.
  • Semi-Supervised and Few-Shot: Leveraging unlabeled data or minimal positive samples in few-shot regimes.
  • Domain Adaptation: Transferring learned centres and scales across domains without re-training the complete model.
  • Structured Modalities: Embedding centre-distance logic in architectures for graphs (GNNs) or sequences (Transformers) (Darban et al., 15 Nov 2025).
  • Other Domains: The underlying principle generalizes to audio, vision, tabular data, and time series, supporting broad use in problems with ambiguous or imbalanced classes.

7. Best Practices and Visualization Insights

  • PCA projections of latent embeddings should be used for diagnostic inspection of cluster tightness and separation (a minimal plotting sketch follows this list).
  • The trade-off parameter ($\lambda$) should be tuned over 0.1–0.5 for most tasks to maximize the benefit of compactness without underweighting the primary classification objective.
  • Embedding normalization is unnecessary beyond batch normalization if centre loss is used directly on raw embeddings.
  • A straightforward training scheme—centre initialization at zero, batch-wise updates with $\alpha$ around 0.5, combined classification and centre loss, and backbone-agnostic encoders—yields robust results and direct applicability to new domains.
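
A small diagnostic sketch for the PCA inspection recommended above, using scikit-learn and matplotlib; the function name and plot styling are illustrative:

```python
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_embedding_pca(z, y):
    """Project embeddings to 2D with PCA and colour points by class label.

    z: (m, d) array of latent embeddings; y: (m,) integer labels.
    Tighter, well-separated clusters suggest the centre constraint is effective.
    """
    z2 = PCA(n_components=2).fit_transform(z)
    plt.scatter(z2[:, 0], z2[:, 1], c=y, s=8, cmap="tab10")
    plt.xlabel("PC 1")
    plt.ylabel("PC 2")
    plt.title("Latent embeddings (PCA projection)")
    plt.show()
```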

In summary, Centre-Enhanced Discriminative Learning offers a unifying framework for integrating geometric centre constraints into discriminative tasks, facilitating interpretable, robust, and high-performing solutions in both class-balanced and highly imbalanced regimes across modalities (Dai et al., 2 Jan 2025, Darban et al., 15 Nov 2025).
