
Anchor-Based NCM Loss in Deep Learning

Updated 24 February 2026
  • Anchor-Based NCM Loss is a deep supervision objective that anchors each class to a fixed prototype, ensuring compact intra-class clustering and clear inter-class separation.
  • It employs Euclidean and cosine metrics to create well-separated clusters, improving performance in closed-set classification and open set recognition.
  • Its fixed anchor framework eliminates the need for dynamic center updates, yielding computational efficiency and robust results across various datasets.

The Anchor-Based Nearest Class Mean (NCM) Loss is a family of supervision objectives for deep convolutional neural networks (CNNs) that impart explicit geometric structure on learned features by "anchoring" each class to a fixed, well-separated prototype vector in feature space. Unlike traditional softmax or center-based losses, which either do not directly enforce class clustering or require learned, potentially unstable class centroids, the anchor-based NCM approach imposes both intra-class compactness and inter-class separability through fixed geometric targets. This methodology underpins a range of recent advances in both closed-set classification and distance-based open set recognition, with notable formulations such as the Euclidean and Cosine Anchor-Based NCM losses (Hao et al., 2018), as well as their generalizations and extensions (Miller et al., 2020).

1. Definition and Theoretical Foundations

Let $\{a_1, a_2, \ldots, a_C\} \subset \mathbb{R}^d$ denote a set of $C$ fixed anchor vectors, one per class, in the $d$-dimensional feature space output by a CNN. During training, for an input $x$ with label $y$, a feature embedding $f_W(x)$ is computed. The central principle is to minimize a loss such that $f_W(x)$ is mapped close to its class anchor $a_y$ and far from all other anchors. The proximity is quantified with a differentiable metric $M(\cdot, \cdot)$, typically the Euclidean or cosine distance. The anchor vectors are selected according to two principles: (1) all are unit-norm, $\lVert a_c \rVert_2 = 1$, to balance feature magnitudes, and (2) they are separated by a minimum pairwise angle, $\angle(a_c, a_{c'}) \geq \theta_M$ for all $c \neq c'$, to ensure strong inter-class separability. Anchors are constructed by uniform sampling or structured placement (e.g., on the vertices of a regular simplex or via grid-meshing) on the unit hypersphere (Hao et al., 2018).
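As an illustration of the uniform-sampling construction, the following NumPy sketch rejection-samples unit-norm anchors on the hypersphere until every pair satisfies the minimum-angle constraint. The function name, defaults, and the rejection-sampling strategy are our own choices for exposition, not taken from the cited papers.

```python
import numpy as np

def sample_anchors(num_classes, dim, min_angle_deg=60.0, seed=0, max_tries=10000):
    """Rejection-sample unit-norm anchors so that every pair of anchors is
    separated by at least min_angle_deg degrees on the unit hypersphere."""
    rng = np.random.default_rng(seed)
    # Two unit vectors are at least theta apart iff their dot product <= cos(theta).
    cos_max = np.cos(np.deg2rad(min_angle_deg))
    anchors = []
    for _ in range(max_tries):
        v = rng.standard_normal(dim)
        v /= np.linalg.norm(v)  # project onto the unit sphere
        if all(np.dot(v, a) < cos_max for a in anchors):
            anchors.append(v)
        if len(anchors) == num_classes:
            return np.stack(anchors)
    raise RuntimeError("could not place all anchors; relax min_angle_deg")

anchors = sample_anchors(num_classes=10, dim=64, min_angle_deg=60.0)
```

In high dimensions random unit vectors are nearly orthogonal, so rejection sampling succeeds quickly; structured placements (simplex vertices, grid-meshing) become necessary only when the feature dimension is small relative to the class count.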

2. Mathematical Formulation and Loss Variants

Given a metric $M$, the class-conditional score for each anchor is $s_c(x) = -M(f_W(x), a_c)$. These scores are passed through a softmax to define conditional probabilities:

$$p(c \mid x) = \frac{\exp(-M(f_W(x), a_c))}{\sum_{c'=1}^{C} \exp(-M(f_W(x), a_{c'}))}$$

The Anchor-Based NCM loss is the negative log-likelihood

$$L(W) = -\frac{1}{N} \sum_{i=1}^{N} \ln p(y_i \mid x_i)$$

Two specific choices yield the principal loss variants:

  • Euclidean NCM (E-NCM): $M_E(f, a) = \lVert f - a \rVert_2$
  • Cosine NCM (C-NCM): $M_C(f, a) = 1 - \frac{f \cdot a}{\lVert f \rVert_2 \lVert a \rVert_2}$
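The two variants differ only in the metric plugged into the softmax. A minimal NumPy sketch of the loss (function name and interface are our own; a real training loop would compute this inside an autodiff framework) might look like:

```python
import numpy as np

def ncm_loss(features, labels, anchors, metric="euclidean"):
    """Anchor-based NCM negative log-likelihood.
    features: (N, d) embeddings f_W(x); anchors: (C, d) fixed anchors."""
    if metric == "euclidean":
        # M_E(f, a) = ||f - a||_2
        dists = np.linalg.norm(features[:, None, :] - anchors[None, :, :], axis=2)
    else:
        # M_C(f, a) = 1 - cos(f, a)
        f = features / np.linalg.norm(features, axis=1, keepdims=True)
        a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
        dists = 1.0 - f @ a.T
    logits = -dists                                # s_c(x) = -M(f_W(x), a_c)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

Features that sit exactly on their class anchor minimize the distance term and hence the loss, which is the clustering behavior the objective is designed to induce.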

Further generalizations, such as the Class Anchor Clustering (CAC) loss (Miller et al., 2020), can be written as

$$\mathcal{L}_{\text{CAC}}(x, y) = \log\left[1 + \sum_{j \neq y} \exp(d_y - d_j)\right] + \lambda \lVert f(x) - c_y \rVert_2$$

where $d_j = \lVert f(x) - c_j \rVert_2$ and $\lambda$ is a hyperparameter. CAC recovers the standard Anchor-Based NCM loss at $\lambda = 1$.
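The CAC formula above combines a tuple term over distance gaps with an anchor term that pulls features toward the correct centre. A compact NumPy sketch (our own naming; centres and $\lambda$ are inputs):

```python
import numpy as np

def cac_loss(features, labels, centres, lam=0.1):
    """Class Anchor Clustering loss, sketched in NumPy.
    Tuple term: log(1 + sum_{j != y} exp(d_y - d_j)); anchor term: lambda * d_y."""
    d = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)  # (N, C)
    n = np.arange(len(labels))
    d_y = d[n, labels]
    diff = d_y[:, None] - d          # d_y - d_j for every j
    mask = np.ones_like(d, dtype=bool)
    mask[n, labels] = False          # exclude j = y from the tuple sum
    tuple_term = np.log1p(np.where(mask, np.exp(diff), 0.0).sum(axis=1))
    return (tuple_term + lam * d_y).mean()
```

The tuple term penalizes any wrong-class distance $d_j$ that is not comfortably larger than $d_y$, while the anchor term directly enforces compactness around $c_y$.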

3. Optimizing Intra-Class and Inter-Class Structure

Minimizing the anchor-based NCM loss explicitly reduces $M(f_W(x_i), a_{y_i})$ (intra-class compactness) while increasing $M(f_W(x_i), a_c)$ for $c \neq y_i$ (inter-class separability). The geometry of the anchors (far apart, unit norm) ensures that each class's features form tight, well-separated clusters, leading to improved feature discrimination. In E-NCM, norm regularization is inherited from the unit-norm anchors; in C-NCM, the emphasis is on angular clustering. The loss admits gradient computation via standard backpropagation and requires no additional sample selection, center-updating schedules, or margin hyperparameters.

4. Training Methodology and Implementation

Training with anchor-based NCM losses proceeds by fixing anchor vectors for all classes, replacing the standard final classification layer with a distance-to-anchor layer, and optimizing only the network parameters $W$. The training utilizes standard batch stochastic gradient descent with momentum and weight decay. Data augmentation and dropout ($p \approx 0.1$–$0.25$) are observed to improve generalization. In CAC, axes-based anchors ($c_i = \alpha e_i$) are used for stability and scalability, with typical hyperparameters $\lambda = 0.1$ and $\alpha = 10$ (Miller et al., 2020). The loss and its variants are robust to moderate changes in these hyperparameters, demonstrating broad applicability across architectures and datasets.
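The "distance-to-anchor layer" replacing the final linear classifier can be sketched as follows, using the axes-based anchors $c_i = \alpha e_i$ described above (function names are illustrative, not from the papers):

```python
import numpy as np

def make_axis_anchors(num_classes, alpha=10.0):
    """Axes-based anchors c_i = alpha * e_i (assumes feature dim == num_classes
    here; in general the feature dimension must be at least the class count)."""
    return alpha * np.eye(num_classes)

def predict(features, anchors):
    """Distance-to-anchor layer at inference: logits are negative Euclidean
    distances, so classification reduces to picking the nearest anchor."""
    d = np.linalg.norm(features[:, None, :] - anchors[None, :, :], axis=2)
    return d.argmin(axis=1)
```

Because the anchors are fixed, this layer has no trainable parameters: all learning happens in the backbone that produces the features.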

5. Comparative Analysis with Other Metric Learning Losses

Anchor-Based NCM loss differs from several popular metric-learning objectives:

| Loss type | Pair/triplet mining? | Center updates? | Margin/hyperparameters |
| --- | --- | --- | --- |
| Contrastive/Triplet | Required | N/A | Possible |
| Center Loss | N/A | Yes (moving average) | Can be unstable |
| L-Softmax | N/A | N/A | Margin ($m$), learning rate, etc. |
| Anchor-NCM | Not required | None (fixed) | None |
| CAC | Not required | None (fixed) | Margin weight $\lambda$ |

Contrastive and triplet losses require $O(N^2)$ pairwise sampling and careful mining; center loss learns centers online via slow update rules, which can destabilize learning, especially for underrepresented classes. Large-Margin Softmax introduces angular margins but at the cost of additional hyperparameters and backward complexity. Anchor-Based NCM losses avoid all center learning and mining complexities, require only a single sample per update, and have no center- or margin-specific tuning, offering both computational and statistical stability (Hao et al., 2018, Miller et al., 2020).

6. Empirical Performance and Applications

Anchor-Based NCM losses have been validated on canonical closed-set image classification datasets, including MNIST, CIFAR10, and CIFAR100. Key outcomes (Hao et al., 2018) include:

  • MNIST: >99.5% accuracy, rapid convergence (≈10 epochs), with feature visualizations showing class clusters tightly coalesced around their anchors.
  • CIFAR10: error reduced from 10.50% (softmax baseline) to 8.89% (E-NCM) and 8.78% (C-NCM).
  • CIFAR10 with augmentation: E-NCM achieves 5.67% error (vs. 5.94% for L-Softmax).
  • CIFAR100: E-NCM reduces error by 1.4 percentage points relative to L-Softmax.

In open set recognition, the CAC extension (Miller et al., 2020) achieves state-of-the-art AUROC on six benchmarks, including a 15.2% increase on TinyImageNet over previous methods, while maintaining closed-set accuracy within 1% of cross-entropy networks. Anchored centers consistently outperform learned centers in open set protocols, especially in high-class-count or high-variability regimes.

7. Extensions, Generalizations, and Practical Considerations

Anchor-based NCM methodologies can readily accommodate alternative anchor placements (e.g., any fixed simplex, random orthonormal vectors), learned anchors (with minor architectural extensions), and applications beyond CNNs (provided feature extraction and distance computation are differentiable). CAC’s decomposition into tuple and anchor loss terms facilitates direct control over the margin vs. cluster tightness trade-off via $\lambda$, and its numerically stable formulation is beneficial in large-class settings. The approach is robust to moderate hyperparameter variation and scales linearly with class count in both computational and memory overhead.
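As one example of an alternative fixed placement, the unit-norm vertices of a regular simplex can be built by centering the standard basis (a standard construction; the helper name is our own). The resulting anchors have pairwise cosine exactly $-1/(C-1)$, the most negative value achievable by $C$ unit vectors:

```python
import numpy as np

def simplex_anchors(num_classes):
    """Unit-norm vertices of a regular simplex, embedded in R^{num_classes}.
    Every pair of anchors has cosine similarity exactly -1/(C-1)."""
    e = np.eye(num_classes)
    v = e - e.mean(axis=0, keepdims=True)            # center the standard basis
    return v / np.linalg.norm(v, axis=1, keepdims=True)
```

The vectors span a $(C-1)$-dimensional subspace, so in practice they would be padded or projected to match the network's feature dimension.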

In summary, Anchor-Based Nearest Class Mean losses provide a principled and efficient approach for endowing learned deep features with explicit clustering structure, leading to enhanced discrimination, stable optimization, and strong open- and closed-set performance (Hao et al., 2018, Miller et al., 2020).
