CLAN: Contrastive Learning with Augmented Negatives

Updated 15 September 2025
  • The paper's main contribution is a framework that leverages augmented negative pairs generated via adversarial, synthetic, or aggressive augmentations to boost contrastive learning performance.
  • It adapts negative sampling methods to enforce stricter repulsion between dissimilar samples while effectively mitigating false negatives.
  • CLAN demonstrates enhanced representation quality and robustness across diverse applications, including vision, language, graphs, sensor data, and cybersecurity.

Contrastive Learning using Augmented Negative pairs (CLAN) refers to a class of methodologies in which the selection, generation, and use of augmented negative pairs are central to the training of discriminative and invariant representations through contrastive objectives. By leveraging harder, adaptively sampled, or semantically diverse negatives—often created through adversarial, synthetic, or aggressive augmentation—CLAN frameworks enforce stricter repulsion between dissimilar samples while maintaining alignment between semantically similar pairs, leading to improved convergence, representation robustness, and downstream performance across domains such as vision, language, graphs, sensor time series, and cybersecurity.

1. Foundational Principles of CLAN

The core mechanism of contrastive learning entails maximizing agreement between positive pairs (usually augmented views of the same sample) while minimizing similarity with negative pairs (augmentations or samples from different sources). CLAN frameworks emphasize the construction or mining of "augmented" negative samples—either by feature mixing, adversarial perturbation, strong augmentations, or learned sampling—that are closer to the anchor in the embedding space than randomly sampled negatives, thereby enhancing the informativeness of the contrastive signal (Kalantidis et al., 2020, Dong et al., 2023, Wilkie et al., 8 Sep 2025).

Formally, given an anchor $\mathbf{z}$, a positive $\mathbf{z}^+$, and a set of negatives $\{\mathbf{z}^-_k\}$, a generic NT-Xent style loss is:

$$\mathcal{L} = -\log \frac{\exp(\operatorname{sim}(\mathbf{z},\mathbf{z}^+)/\tau)}{\exp(\operatorname{sim}(\mathbf{z},\mathbf{z}^+)/\tau) + \sum_{k}\exp(\operatorname{sim}(\mathbf{z},\mathbf{z}^-_k)/\tau)}$$

In CLAN, augmented negative pairs $\mathbf{z}^-_k$ may be generated via adversarial procedures (Ho et al., 2020), feature interpolation (Dong et al., 2023), strong augmentations (Deng et al., 2022), or adaptive mining (Bose et al., 2018), and their construction or weighting is coupled with mechanisms to control for false negatives or domain-specific invariances (Wilkie et al., 8 Sep 2025, Thota et al., 2021).
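
As a concrete illustration, the following is a minimal PyTorch sketch of the NT-Xent objective above for a single anchor. The function name and tensor shapes are illustrative and not taken from any of the cited implementations; in a CLAN-style setup, `negatives` would include augmented or synthesized negative embeddings.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(anchor, positive, negatives, temperature=0.5):
    """NT-Xent loss for one anchor.

    anchor:    (d,) embedding of the anchor view
    positive:  (d,) embedding of the positive (augmented) view
    negatives: (K, d) embeddings of K negatives, which may include
               synthetic or adversarially augmented samples
    """
    anchor = F.normalize(anchor, dim=0)
    positive = F.normalize(positive, dim=0)
    negatives = F.normalize(negatives, dim=1)

    pos_sim = torch.dot(anchor, positive) / temperature   # scalar logit
    neg_sim = negatives @ anchor / temperature             # (K,) logits

    # -log softmax of the positive logit against positive + negative logits
    logits = torch.cat([pos_sim.unsqueeze(0), neg_sim])
    return -F.log_softmax(logits, dim=0)[0]
```

In practice, `anchor` and `positive` would be encoder outputs for two views of the same sample, and `negatives` would stack embeddings of other samples plus any augmented negatives.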

2. Adaptive and Augmented Negative Sampling

Conventional contrastive approaches often yield "easy" negatives—negatives far from the anchor in representation space, imparting limited learning signal. CLAN strategies augment the negative pool through several means:

  • Adversarial Sampling: Adaptive generation of hard negatives via an adversarial generator that maximizes the discriminator's loss (Bose et al., 2018, Ho et al., 2020). For input $x^+$, the adaptive negative distribution becomes a mixture:

$$p^-(y \mid x^+) = \lambda\, p_{nce}(y) + (1-\lambda)\, g_\theta(y \mid x^+)$$

Here $g_\theta$ is learned to propose negatives that maximize the model's loss (a minimax game).

  • Synthetic Hard Negatives: Feature-level mixing of negatives via convex combinations or interpolation (Kalantidis et al., 2020, Dong et al., 2023). For instance, synthetic negatives $h_k$ are generated as

$$h_k = \frac{\alpha_k n_i + (1-\alpha_k) n_j}{\lVert \alpha_k n_i + (1-\alpha_k) n_j \rVert_2}$$

with $n_i$, $n_j$ being among the hardest negatives by similarity to the anchor (see the sketch following this list).

  • Strong/Domain-specific Augmentations: Use of augmentations that induce significant distributional shift (e.g., frequency transforms in sensor data, aggressive image transformations as in SACC (Deng et al., 2022)) to create negatives that challenge invariance beyond standard augmentations (Kim et al., 17 Jan 2024).
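
The feature-mixing construction above can be sketched in PyTorch as follows. It assumes a precomputed pool of normalized negative embeddings (e.g., a memory queue); pool sizes and the function name are chosen for exposition rather than taken from the cited papers.

```python
import torch
import torch.nn.functional as F

def synthesize_hard_negatives(anchor, negatives, num_hard=64, num_synth=16):
    """Mix pairs of the hardest negatives into synthetic negatives.

    anchor:    (d,) normalized anchor embedding
    negatives: (N, d) normalized negative embeddings (e.g., a memory queue)
    Returns (num_synth, d) synthetic negatives on the unit sphere.
    """
    # Rank negatives by similarity to the anchor and keep the hardest ones.
    sims = negatives @ anchor                      # (N,)
    hard = negatives[sims.topk(num_hard).indices]  # (num_hard, d)

    # Randomly pair hard negatives and mix them with random coefficients.
    i = torch.randint(0, num_hard, (num_synth,))
    j = torch.randint(0, num_hard, (num_synth,))
    alpha = torch.rand(num_synth, 1)
    mixed = alpha * hard[i] + (1 - alpha) * hard[j]

    # Re-project onto the unit sphere, matching h_k in the equation above.
    return F.normalize(mixed, dim=1)
```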

These mechanisms intentionally "harden" the proxy task, fostering more discriminative boundaries.

3. Managing False Negatives and Sample Mining

A challenge with aggressive negative sampling is the potential inclusion of "false negatives" (semantically similar samples labeled as negatives), which can degrade representation quality. Addressing this, CLAN frameworks incorporate:

  • False Negative Elimination: Removing or down-weighting negatives with high similarity to the anchor from the loss denominator, for instance by discarding top-K similar negatives (Thota et al., 2021) or by debiasing the loss using prior estimates (Dong et al., 2023); a minimal sketch follows this list.
  • Potential Sample Mining: Weighting or probabilistically sampling from a set of candidate negatives (or positives), ensuring selected negatives are informative—neither too easy nor too hard, and minimizing inclusion of false negatives (Dong et al., 2023). Potential negatives may be mined using gradient-based criteria or similarity thresholds.
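
A minimal sketch of the top-K elimination strategy from the first item above, assuming normalized embeddings; the value of K and the masking approach are illustrative assumptions rather than the exact procedure of the cited work.

```python
import torch

def eliminate_false_negatives(anchor, candidates, top_k=5):
    """Drop the top_k candidates most similar to the anchor.

    anchor:     (d,) normalized anchor embedding
    candidates: (N, d) normalized candidate negative embeddings
    Returns the (N - top_k, d) negatives kept in the loss denominator.
    """
    sims = candidates @ anchor                      # (N,)
    # Candidates most similar to the anchor are the likeliest false negatives.
    drop = sims.topk(top_k).indices
    keep = torch.ones(candidates.size(0), dtype=torch.bool)
    keep[drop] = False
    return candidates[keep]
```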

4. Mathematical Models and Loss Design

Mathematical modeling within CLAN emphasizes the use of adaptive, weighted, or smoothed loss formulations:

  • Weighted Hard Negatives: Loss contributions from each negative are weighted proportionally to similarity (e.g., $w_{z,z'} = \exp(\operatorname{sim}(z, z')/r)$), leading to controlled contrast based on hardness (Dong et al., 2023); see the sketch after this list.
  • Minimax Games: Adversarial negative samplers lead to minimax objectives:

$$\min_\omega\, \max_\theta\, \mathbb{E}_{p^+} \big[\, L(\omega, \theta; x^+) \,\big]$$

where $\theta$ parameterizes the negative generator.

  • Proximity-aware Losses: In structured domains (e.g., graphs), smoothing techniques (Taubin, bilateral, diffusion) yield soft assignment matrices for positives and negatives, allowing loss regularization by structural/geometric proximity (Behmanesh et al., 23 Feb 2024).
  • Angular/Geometric Margins: Explicit angular separation between positive/negative embeddings enhances tolerance and discriminability (e.g., Angular Contrastive Loss (Wang et al., 2022)).
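
For the hardness weighting in the first item of this list, the sketch below applies $w_{z,z'} = \exp(\operatorname{sim}(z,z')/r)$ as a per-negative weight inside an NT-Xent-style denominator. The exact placement of the weight is one plausible reading of Dong et al. (2023) and should be treated as an assumption.

```python
import torch
import torch.nn.functional as F

def weighted_hard_negative_loss(anchor, positive, negatives,
                                temperature=0.5, hardness_r=0.5):
    """NT-Xent-style loss with hardness-weighted negative terms.

    Each negative's denominator term is scaled by
    w = exp(sim(anchor, negative) / hardness_r), so harder negatives
    (higher similarity) contribute a stronger repulsive signal.
    """
    anchor = F.normalize(anchor, dim=0)
    positive = F.normalize(positive, dim=0)
    negatives = F.normalize(negatives, dim=1)

    pos = torch.exp(torch.dot(anchor, positive) / temperature)
    neg_sims = negatives @ anchor                          # (K,) similarities
    weights = torch.exp(neg_sims / hardness_r)             # hardness weights
    neg = (weights * torch.exp(neg_sims / temperature)).sum()

    return -torch.log(pos / (pos + neg))
```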

5. Empirical Results and Domain Adaptations

CLAN delivers improvements on diverse tasks:

  • Vision: Synthetic and hard negative approaches yield better accuracy on linear classification, detection, segmentation, and more uniform feature embeddings (Kalantidis et al., 2020, Dong et al., 2023). For clustering, strong augmentation (triplet-view) enhances cluster separability (Deng et al., 2022).
  • Language and Knowledge Graphs: Adaptive/hard negative sampling enables improved embedding quality, faster convergence, and better metric performance (e.g., mean reciprocal rank in KGs) (Bose et al., 2018).
  • Audio/Time Series and Sensors: CLAN architectures leveraging time/frequency "two-tower" encoders and domain-specific augmentations show up to 12% AUROC improvement for novelty detection in activity recognition (Kim et al., 17 Jan 2024).
  • Graphs: Proximity-injected losses (SGCL) outperform existing GCL frameworks for node/graph classification; smoothing lessens penalties for negatives near the decision boundary (Behmanesh et al., 23 Feb 2024).
  • Cybersecurity: Treating augmentations as negative views uniquely supports modeling benign distributions for network intrusion detection; inference is $O(1)$ by distance from a single centroid, yielding efficient, robust anomaly detectors with top AUROC (Wilkie et al., 8 Sep 2025).

6. Practical Considerations and Resource Implications

The integration of augmented negatives yields both benefits and challenges:

  • Training Efficiency: Augmented/hard negative strategies provide more informative gradients, accelerating convergence in iterations but with possible increased per-iteration cost (e.g., synthetic sample generation or adversarial optimization).
  • Computational Cost: Feature-level mixing (hard negative synthesis) introduces minimal additional computation (as with MoCHi), while adversarial or memory-bank methods may require more resources but can be offset by faster convergence.
  • Representation Quality: Enhanced negative curation mitigates "shortcuts," improves robustness under domain shift/corruption, and produces embeddings beneficial for transfer learning (Ge et al., 2021, Wang et al., 2022).
  • Deployment: Single-centroid Gaussian models (as in CLAN for intrusion detection) lead to efficient inference—a single distance computation per test sample; a minimal sketch follows this list.
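
A minimal sketch of the single-centroid deployment pattern above: fit a centroid on benign embeddings, then score each test sample with one distance computation. The threshold calibration shown here (a high quantile of training distances) is illustrative and not taken from the cited paper.

```python
import torch

class CentroidAnomalyScorer:
    """Score test samples by distance to the centroid of benign embeddings."""

    def fit(self, benign_embeddings):
        # benign_embeddings: (N, d) embeddings of benign training samples
        self.centroid = benign_embeddings.mean(dim=0)           # (d,)
        # Illustrative threshold: 99th percentile of training distances.
        dists = (benign_embeddings - self.centroid).norm(dim=1)
        self.threshold = torch.quantile(dists, 0.99)
        return self

    def score(self, embedding):
        # O(1) per sample: a single distance to the stored centroid.
        return (embedding - self.centroid).norm()

    def is_anomalous(self, embedding):
        return self.score(embedding) > self.threshold
```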

Trade-offs center on balancing hardness, diversity, and computational overhead, with the risk of overfitting or excessive penalization when negative mining is too aggressive or insufficiently debiased.

7. Extensions and Future Directions

Potential avenues for CLAN frameworks include:

  • Broader Modalities: Extending adversarial/augmented negatives to multimodal contrastive learning, cross-view alignment, and retrieval (Desai et al., 12 Feb 2025).
  • Robustness to Data Contamination: Evaluating and improving resilience to false positives/negatives and label noise is essential for real-world deployment, especially in anomaly detection (Wilkie et al., 8 Sep 2025).
  • Automated Pair Curation: Dynamic adjustment of pair selection and weighting via learned algorithms or proximity-aware measures (Dong et al., 2023, Behmanesh et al., 23 Feb 2024).
  • Integration with Angular/Geometric Losses: Further exploration of margin-based constraints for negative pair separation (Wang et al., 2022).
  • Efficient Large-scale Learning: Development of scalable memory banks, batched subgraph strategies (for graphs), and efficient neighbor mining techniques are crucial for handling massive datasets (Behmanesh et al., 23 Feb 2024).

CLAN thus embodies a flexible toolkit of methods for augmenting and curating negative examples in contrastive learning, contributing significantly to the representation quality, robustness, and applicability of self-supervised and unsupervised learning systems across domains.
