
Oriented Contrastive Denoising Overview

Updated 4 July 2026
  • Oriented contrastive denoising is a paradigm that reframes denoising as a contrastive alignment task using explicit corruption signals.
  • It leverages structured orientation cues—such as masking tokens, noise levels, anatomical consistency, and temporal adjacency—to guide positive and negative pairing.
  • This approach has demonstrated tangible improvements in video-language pre-training, diffusion model robustness, and low-dose CT imaging.

Oriented contrastive denoising denotes a class of objectives in which denoising is not treated solely as direct reconstruction, but as a contrastive alignment problem whose positives and negatives are explicitly directed by a known source of corruption or correspondence. In the literature covered here, that direction is supplied by artificial masking in video–language pre-training, by relative noise level in diffusion models, by anatomy-aware semantic correspondence in low-dose CT, or by adjacency along a probability-flow trajectory. Taken together, these works suggest that the defining property of the approach is not a single architecture, but the use of a structured orientation signal that specifies which noisy and clean states should be pulled together and which mismatched states should be pushed apart (Luo et al., 2021, Wu et al., 2024, Wang et al., 11 Aug 2025, Lei et al., 22 Jan 2025).

1. General formulation and defining characteristics

A compact comparison of the main formulations is given below.

Work | Orientation source | Contrastive structure
CoCo-BERT | artificial [MASK] tokens; true video–sentence pairing | masked query matched to paired unmasked cross-modal key and to its own unmasked intra-modal key
Contrastive Diffusion Training | different amounts of noise; OOD pair $(z_\zeta,\beta)$ | binary classification between $x\sim p_0$ and $x\sim p_\zeta$
ALDEN | same anatomy; tissue-specific semantics | positive same-coordinate denoised/NDCT pair; negatives from same-coordinate LDCT and cross-location NDCT
rRCM | adjacent time-steps on the same PF-ODE trajectory with the same $\epsilon$ | positive pair $(x_{t_n}^i,x_{t_{n-1}}^i)$; negatives from other samples in the batch

The shared pattern is that the denoising target is oriented by a relation that is stronger than generic augmentation invariance. In CoCo-BERT, the relevant corruption is the masking procedure itself. In contrastive diffusion training, the orientation is between noisy marginals with different log-SNR values and the associated OOD failure mode. In ALDEN, the alignment relation is anatomical consistency at matched spatial coordinates, supplemented by negatives designed to suppress residual noise and anatomical misplacement. In rRCM, orientation is temporal and dynamical: the positive pair is restricted to adjacent points on the same diffusion trajectory.

This suggests that oriented contrastive denoising is best understood as a design pattern in which denoising supervision is anchored to a known corruption process or semantic relation. The denoising signal can therefore be expressed in embedding space, in cross-modal representation space, or through a classifier-like objective over noise levels, rather than only through pixel-wise regression.

2. Masking-oriented cross-modal denoising in CoCo-BERT

CoCo-BERT introduces “Contrastive Cross-modal matching and denoising” (CoCo) for video–language pre-training. The proxy objective adds a single unified loss to the standard masked-language modeling and masked-sequence-generation objectives. It has two parts: Inter-modal Contrastive Matching (Co-IM), which encourages a masked video or sentence query to match its paired unmasked sentence or video key and to be distinct from cross-modal negatives, and Intra-modal Contrastive Denoising (Co-ID), which encourages a masked video or sentence query to align with its own unmasked video or sentence key and to be distinct from same-modality negatives (Luo et al., 2021).

For one video–sentence pair in a mini-batch, masked inputs are encoded as queries and unmasked inputs as keys. After projection by a small MLP plus attention, the model obtains $q_V^{(m)}, q_S^{(m)}, k_V^+, k_S^+ \in \mathbb{R}^d$, and maintains two cross-batch memory banks of negatives of size $K$. With cosine similarity

$$\langle x,y\rangle = (x^\top y)/(\|x\|\|y\|)$$

and $s(x,y)=\exp(\langle x,y\rangle/\tau)$, the losses are

$$L_{Co\text{-}IM}=L_{NCE}^{V\to S}+L_{NCE}^{S\to V}, \qquad L_{Co\text{-}ID}=L_{NCE}^{V}+L_{NCE}^{S},$$

and

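$$L_{NCE} = -\log \frac{s(q,k^+)}{s(q,k^+)+\sum_{j=1}^{K} s(q,k_j^-)},$$

where, for each term, $q$ is the masked query, $k^+$ its positive key (the paired cross-modal key for Co-IM, its own unmasked key for Co-ID), and $k_1^-,\dots,k_K^-$ the negatives drawn from the corresponding memory bank.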
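
A minimal sketch of how the two CoCo terms could be computed, assuming each NCE term takes the usual InfoNCE form over one positive key and a set of memory-bank negatives. The function names, the toy vectors, and the temperature `tau=0.1` are illustrative assumptions, not values from the paper.

```python
import math

def cosine(x, y):
    """Cosine similarity <x, y> = x.y / (|x| |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny)

def info_nce(query, pos_key, neg_keys, tau=0.1):
    """-log( s(q,k+) / (s(q,k+) + sum_j s(q,k_j-)) ) with s = exp(cos / tau)."""
    logits = [cosine(query, pos_key) / tau]
    logits += [cosine(query, k) / tau for k in neg_keys]
    m = max(logits)  # subtract the max before exponentiating, for stability
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]

# Toy masked queries, unmasked keys, and memory-bank negatives (made-up values).
q_v, q_s = [0.9, 0.1, 0.0], [0.8, 0.2, 0.1]
k_v, k_s = [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]
neg_v = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
neg_s = [[0.1, 0.9, 0.0], [0.0, 0.2, 0.9]]

# Co-IM: masked query against the paired key of the other modality.
loss_co_im = info_nce(q_v, k_s, neg_s) + info_nce(q_s, k_v, neg_v)
# Co-ID: masked query against its own unmasked key in the same modality.
loss_co_id = info_nce(q_v, k_v, neg_v) + info_nce(q_s, k_s, neg_s)
```

The only difference between the two terms in this sketch is which key plays the role of the positive, which is the sense in which the masking and the true pairing "orient" the contrast.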

The architecture uses two query encoders and two key encoders, each a standard Transformer stack with 6 layers. The key encoders have identical architectures but are updated by momentum,

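$$\theta_k \leftarrow m\,\theta_k + (1-m)\,\theta_q,$$

where $\theta_k$ and $\theta_q$ are the key and query encoder parameters and the momentum coefficient $m$ is set to […]. A cross-modal decoder with 6 Transformer blocks sits on top of the query outputs to perform MLM and MSG. Two FIFO memories of size […] per modality store recent positive keys as negatives.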
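
The momentum-updated key encoder and the FIFO memory can be sketched as follows. This assumes the common exponential-moving-average update for the key parameters; the coefficient `m=0.99`, the queue length, and the flat parameter lists are placeholders chosen for illustration only.

```python
from collections import deque

def momentum_update(key_params, query_params, m=0.99):
    """theta_k <- m * theta_k + (1 - m) * theta_q, applied element-wise."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

# FIFO memory of positive keys: once full, the oldest entry is dropped
# automatically each time a new key is appended.
memory = deque(maxlen=4)
for step in range(6):
    new_key = [float(step)]  # stand-in for a key embedding
    memory.append(new_key)
```

Because no gradient flows into the key encoder, its parameters drift slowly toward the query encoder, which keeps the stored keys comparable across training steps.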

The training pipeline masks 15% of frame positions and 15% of word tokens to form the queries while keeping unmasked originals as keys. Query encoders produce $q_V^{(m)}$ and $q_S^{(m)}$, key encoders produce $k_V^+$ and $k_S^+$, negative sets are built from memory, and the model computes […], […], […], and […]. Gradients are back-propagated through the query encoders and decoder; the key encoders are updated only by momentum. The reported settings are memory bank size […] per modality, temperature […], batch size […], learning rate […], up to […] pre-train epochs, and frame sampling of up to […] frames on TV and up to […] frames on ACTION, with frame features from ResNet-152 […] and SlowFast […].
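
The query/key construction described above can be illustrated with a short sketch. It assumes that exactly 15% of positions (rounded) are chosen uniformly at random and replaced by a `[MASK]` symbol; the helper name and the sampling rule are hypothetical and may differ from the paper's procedure.

```python
import random

MASK = "[MASK]"

def make_query_and_key(tokens, mask_ratio=0.15, rng=None):
    """Return (masked query, unmasked key) for one token or frame sequence."""
    rng = rng or random.Random()
    n_mask = max(1, round(mask_ratio * len(tokens)))
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))
    query = [MASK if i in masked_positions else tok
             for i, tok in enumerate(tokens)]
    key = list(tokens)  # the key keeps the original, unmasked sequence
    return query, key

tokens = [f"w{i}" for i in range(20)]
query, key = make_query_and_key(tokens, rng=random.Random(0))
```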

The paper explicitly explains why this is an “oriented” form of contrastive denoising. The only noise injected is from the artificial [MASK] tokens, and CoCo “orients” denoising specifically against that noise by contrasting each masked representation with its exact unmasked counterpart. At the same time, cross-modal matching is oriented toward the true pairing: each masked video must match its true unmasked sentence rather than any other sentence, and vice versa. CoCo-BERT was pre-trained on TV and ACTION and evaluated on cross-modal retrieval, video question answering, and video captioning, where the authors report superior performance for the pre-trained model.

3. Noise-level discrimination and OOD denoising in diffusion models

A related formulation appears in contrastive diffusion training, which begins from the claim that diffusion models implicitly define a log-likelihood ratio between noisy marginals and can therefore be interpreted as hidden noise-level classifiers. The noisy family is written as

[…]

with log-SNR parameter […]. For two noise levels […], the paper defines a log-likelihood ratio […] and emphasizes that standard diffusion training only observes […] on […], whereas evaluations at mismatched noise levels lie outside the training distribution and degrade denoiser quality (Wu et al., 2024).

The proposed self-supervised contrastive diffusion loss (CDL) turns the implicit classifier for noise levels into a training signal. A binary label indicates whether a sample is drawn as $x\sim p_0$ or as $x\sim p_\zeta$, with the two cases sampled with equal probability. The loss is

[…]

By inserting the density-in-terms-of-denoiser form, the objective becomes a contrastive MSE-based loss on pairs at two noise levels. In the reported implementation, each training step samples a real image […], samples […], flips a fair coin […], computes approximate log densities via MSE proxies at SNR […] and at induced noise level […], and backpropagates […]. The paper characterizes this as “filling in” supervision on the OOD pair $(z_\zeta,\beta)$.
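
The idea of training on a noise-level classification task can be illustrated with a toy example. This sketch is not the paper's CDL: it assumes 1D Gaussian data, additive Gaussian noise, and exact log densities in place of the denoiser-based MSE proxies. It only shows that the log-likelihood ratio between two noisy marginals acts as the logit of a binary classifier.

```python
import math
import random

def log_normal_pdf(x, var):
    """Log density of a zero-mean Gaussian with variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + x * x / var)

def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))

def noise_level_bce(x, y, clean_var, noisy_var):
    """Binary cross-entropy with logit = log p_noisy(x) - log p_clean(x).

    y = 1 marks a sample that received extra noise, y = 0 a clean one.
    """
    logit = log_normal_pdf(x, noisy_var) - log_normal_pdf(x, clean_var)
    return softplus(-logit) if y == 1 else softplus(logit)

rng = random.Random(0)
clean_var, extra_var = 1.0, 3.0  # toy variances, chosen arbitrarily
losses = []
for _ in range(2000):
    y = rng.randint(0, 1)                      # fair coin
    x = rng.gauss(0.0, math.sqrt(clean_var))   # "data" sample
    if y == 1:
        x += rng.gauss(0.0, math.sqrt(extra_var))
    losses.append(noise_level_bce(x, y, clean_var, clean_var + extra_var))
mean_loss = sum(losses) / len(losses)
```

With exact densities the mean loss falls below $\log 2$, the value obtained by a classifier that cannot tell the two noise levels apart. In the diffusion setting the densities are only available through the denoiser, so minimizing such a loss pushes the denoiser to be accurate at mismatched noise levels as well.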

The empirical results target both sequential and parallel sampling. On a parallel sampler with […] samples, DDPM on CIFAR-10 […] improves from FID […] to […], VP from […] to […], VE from […] to […], FFHQ […] VP from […] to […], and FFHQ […] VE from […] to […]. On the sequential deterministic EDM sampler with […] samples, the gains are modest but consistent: for example, VP on FFHQ improves from […] to […] at NFE […], and VE on FFHQ improves from […] to […] at NFE […]. In the 2D Dino synthetic example at target MMD […], the number of Picard iterations drops from […] to […], NFE from […] to […], and wall-time from […] to […].

Within the broader topic, this work generalizes contrastive denoising beyond paired clean/noisy reconstructions. The orientation is supplied by noise level itself and by the specific OOD discrepancy created when the denoiser is queried off the standard forward path.

4. Anatomy-aware semantic contrastive denoising in low-dose CT

ALDEN formulates an anatomy-aware low-dose CT denoising pipeline built on a GAN backbone with two additional components: an Anatomy-Aware Discriminator (AAD) and a Semantic-Guided Contrastive Learning (SCL) module. The generator adopts ESAU-Net and maps a low-dose CT (LDCT) slice […] to […] under pixel-wise supervision

[…]

where […] is the paired normal-dose CT (NDCT). The discriminator is conditioned on hierarchical semantic features extracted from the reference NDCT by a fixed pretrained vision model (PVM) […], for example DINOv2 or MedSAM. Three levels of embeddings, […], are taken from transformer layers […], […], and […], while the discriminator feature maps are denoted […]. At each level, an Attention-based Feature Fusion module cross-attends semantic priors and discriminator features to form anatomy-aware features […] (Wang et al., 11 Aug 2025).

The adversarial game is

[…]

The contrastive component acts on PVM feature embeddings. For each batch of size […], the fixed PVM extracts feature tensors from the LDCT input […], the denoised output […], and the reference NDCT […]. At randomly sampled spatial coordinates, ALDEN constructs one positive set and two negative sets: same-location denoised/NDCT pairs preserve structure, same-location denoised/LDCT pairs penalize residual noise, and different-location denoised/NDCT pairs penalize anatomical misalignment. The InfoNCE-style objective is

[…]

with cosine similarities and […].
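
The positive and negative sets described above can be sketched in a few lines. This is an illustrative reading rather than ALDEN's implementation: feature maps are represented as dictionaries from coordinates to vectors, the temperature `tau=0.07` is an assumed value, and the function name `scl_loss` is hypothetical.

```python
import math

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def scl_loss(f_den, f_ndct, f_ldct, coords, tau=0.07):
    """InfoNCE over sampled coordinates with one positive and two negative sets."""
    total = 0.0
    for p in coords:
        anchor = f_den[p]
        # Positive: NDCT feature at the same coordinate.
        logits = [cosine(anchor, f_ndct[p]) / tau]
        # Negative set 1: LDCT feature at the same coordinate (residual noise).
        logits.append(cosine(anchor, f_ldct[p]) / tau)
        # Negative set 2: NDCT features at other coordinates (misalignment).
        logits += [cosine(anchor, f_ndct[r]) / tau for r in coords if r != p]
        m = max(logits)
        total += m + math.log(sum(math.exp(l - m) for l in logits)) - logits[0]
    return total / len(coords)

# Toy feature maps at two sampled coordinates (made-up values).
coords = [(0, 0), (1, 1)]
f_ndct = {(0, 0): [1.0, 0.0, 0.0], (1, 1): [0.0, 1.0, 0.0]}
f_ldct = {(0, 0): [0.6, 0.1, 0.8], (1, 1): [0.1, 0.6, 0.8]}

loss_good = scl_loss(f_ndct, f_ndct, f_ldct, coords)  # output matches NDCT
loss_bad = scl_loss(f_ldct, f_ndct, f_ldct, coords)   # output still looks like LDCT
```

In this toy setup an output whose features match the NDCT gets a much lower loss than one that still resembles the noisy input, which is the behaviour the two negative sets are meant to enforce.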

The total objective is

[…]

with […] and […]. Training uses Adam with […], […], learning rate […], batch size […], and […] iterations.

The paper reports the following quantitative results. On Mayo2016, ALDEN-DINOv2 achieves PSNR […] dB, SSIM […], RMSE […], and LPIPS […]. On the in-house MCTD dataset, ALDEN-DINOv2 leads in SSIM […], RMSE […], and LPIPS […]. The Mayo2016 ablation study reports the following progression: the baseline ESAU-Net+GAN yields PSNR […], SSIM […], RMSE […], LPIPS […]; adding AAD only gives […], […], […], […]; adding SCL only gives […], […], […], […]; and ALDEN with both components gives […], […], […], […]. On the downstream multi-organ segmentation task with TotalSegmentator’s test set of […] CTs and […] organs, the Dice score is […] in the low-noise scenario and […] in the high-noise scenario, the latter reported as best with […] over the next best method.

In this formulation, orientation is semantic and spatial. Positive pairing is restricted to matched anatomy at the same coordinates, while the dual negatives explicitly target residual LDCT noise and cross-location anatomical mismatch. The paper presents this as a way to preserve tissue-specific patterns without requiring manual segmentation labels.

5. Trajectory-oriented latent denoising in robust representation consistency models

rRCM reformulates denoising along diffusion trajectories as a discriminative latent-space problem connected to randomized smoothing. The underlying forward SDE is

$$dx_t = f(x_t,t)\,dt + g(t)\,dw_t$$

with probability-flow ODE

$$\frac{dx_t}{dt} = f(x_t,t) - \tfrac{1}{2}\,g(t)^2\,\nabla_{x}\log p_t(x_t).$$

After discretization at times […], noisy points satisfy […], and rRCM uses instance discrimination to align temporally adjacent points along the same trajectory (Lei et al., 22 Jan 2025).

The encoder is a Vision Transformer with time embedding, denoted […], followed by a linear head […] for logits and, during pre-training only, a 3-layer MLP projector […]. With normalized embeddings, the oriented consistency term is

[…]

Here the positive pair is $(x_{t_n}^i,x_{t_{n-1}}^i)$, where both points use the same Gaussian noise $\epsilon$ and differ only by adjacent time-step. In parallel, the model applies a standard augmentation contrastive loss using two augmented views of the clean image. The joint pre-training objective minimizes the sum of the consistency and augmentation contrastive terms.
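
The pairing rule can be sketched as a batch-wise InfoNCE loss. This is a simplified illustration, not rRCM's implementation: it assumes the perturbation `x + t * eps`, uses an identity map in place of the time-conditioned ViT and projector, and the temperature `tau=0.1` and all toy values are arbitrary.

```python
import math
import random

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def noisy_pair(x, t_n, t_prev, rng):
    """Two points on one trajectory: shared eps, adjacent noise scales (assumed form)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x]
    return ([xi + t_n * e for xi, e in zip(x, eps)],
            [xi + t_prev * e for xi, e in zip(x, eps)])

def consistency_loss(z_tn, z_prev, tau=0.1):
    """InfoNCE over the batch: index i is the positive, other samples are negatives."""
    total = 0.0
    for i, anchor in enumerate(z_tn):
        logits = [cosine(anchor, z) / tau for z in z_prev]
        m = max(logits)
        total += m + math.log(sum(math.exp(l - m) for l in logits)) - logits[i]
    return total / len(z_tn)

rng = random.Random(0)
batch = [[5.0, 0.0], [0.0, 5.0], [-5.0, 0.0]]  # toy "clean images"
pairs = [noisy_pair(x, 0.5, 0.4, rng) for x in batch]
z_tn = [p[0] for p in pairs]    # identity "encoder", for illustration only
z_prev = [p[1] for p in pairs]

aligned = consistency_loss(z_tn, z_prev)
# Mis-specified orientation: positives taken from a different trajectory.
misaligned = consistency_loss(z_tn, z_prev[1:] + z_prev[:1])
```

Rotating the second list so that each anchor is paired with another sample's point makes the loss much larger in this toy case, mirroring the idea that positives are only meaningful when they lie on the same trajectory.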

The paper defines orientation very narrowly: because the PF-ODE yields a unique continuous trajectory for each clean image, rRCM draws positives only among adjacent time-steps on the same trajectory. The ablations state that pairing points with different $\epsilon$ or pairing non-adjacent time-steps breaks that orientation and yields inferior alignment. Model sizes reported for ImageNet are rRCM-S with […]M parameters, rRCM-B with […]M, and rRCM-B-Deep with […]M. The temperature is […] for both consistency and augmentation losses. Pre-training uses ImageNet for […]k steps with batch size […] and CIFAR-10 for […]k steps with batch size […], using AdamW with learning rate […]; fine-tuning uses […] epochs on ImageNet and […] epochs on CIFAR-10.

For randomized smoothing, once the classifier is fixed, the smoothed classifier is

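$$g(x) = \arg\max_{c}\; \mathbb{P}_{\delta\sim\mathcal{N}(0,\sigma^2 I)}\big(f(x+\delta)=c\big),$$

and the certified radius is

$$R = \frac{\sigma}{2}\left(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\right),$$

where $f$ is the base classifier, $\sigma$ the smoothing noise level, $\Phi^{-1}$ the inverse standard normal CDF, and $\underline{p_A}$, $\overline{p_B}$ a lower bound on the top-class probability and an upper bound on the runner-up probability.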
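
For orientation, the standard randomized-smoothing recipe can be sketched with the Python standard library. The sketch assumes the usual Gaussian-smoothing radius $R = \tfrac{\sigma}{2}\,(\Phi^{-1}(p_A) - \Phi^{-1}(p_B))$ and a Monte Carlo majority vote; the toy base classifier and all numeric values are made up, and the statistical confidence bounds used in practice are omitted.

```python
import random
from collections import Counter
from statistics import NormalDist

def certified_radius(p_a_lower, p_b_upper, sigma):
    """L2 radius from bounds on the top-class and runner-up probabilities."""
    inv_cdf = NormalDist().inv_cdf  # inverse standard normal CDF
    return 0.5 * sigma * (inv_cdf(p_a_lower) - inv_cdf(p_b_upper))

def smoothed_predict(base_classifier, x, sigma, n_samples, rng):
    """Majority vote of the base classifier under Gaussian input noise."""
    votes = Counter(
        base_classifier([xi + rng.gauss(0.0, sigma) for xi in x])
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]

def toy_classifier(x):
    """Stand-in for the trained model: sign of the first coordinate."""
    return 1 if x[0] > 0 else 0

prediction = smoothed_predict(toy_classifier, [2.0], 0.5, 200, random.Random(0))
radius = certified_radius(0.9, 0.1, sigma=0.5)
```

Each noisy sample costs one forward pass of the base classifier in this sketch, with no separate denoising step before classification.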

The paper reports that the method outperforms the certified accuracy of diffusion-based methods on ImageNet across all perturbation radii by […] on average, with up to […] at larger radii, while reducing inference costs by […] on average. On ImageNet, rRCM-B reports certified accuracies […] at radii […], with latency […] s / […] s, while rRCM-B-Deep reports […] with latency […] m […] s. On CIFAR-10, rRCM-B reports latency […] s and certified accuracies […] at radii […].

This work places oriented contrastive denoising in a robustness setting rather than a generative one. The denoising effect is implicit: instead of explicitly reconstructing a clean sample, the model learns representation consistency as one moves backward along the diffusion path, enabling one-shot denoising-and-classification.

6. Conceptual scope, common misconceptions, and open directions

The literature does not present a single universal objective for oriented contrastive denoising. Instead, each formulation instantiates orientation differently: masking noise and true cross-modal pairing in CoCo-BERT, noise-level discrimination and OOD correction in contrastive diffusion training, anatomy-aware spatial semantics in ALDEN, and adjacent same-trajectory consistency in rRCM (Luo et al., 2021, Wu et al., 2024, Wang et al., 11 Aug 2025, Lei et al., 22 Jan 2025).

A common misconception is that contrastive denoising is equivalent to generic augmentation-based self-supervision. The cited formulations do not support that interpretation. Their positives are exact unmasked counterparts, clean-versus-extra-noisy samples, same-location anatomical matches, or adjacent points on the same diffusion trajectory. Their negatives are likewise structured: FIFO memory-bank negatives from other videos or sentences, mismatched noise levels, same-location LDCT residuals, cross-location anatomy mismatches, or other samples in the batch. This suggests that the critical ingredient is not contrastive learning in the abstract, but the specification of a domain-valid orientation relation.

A second misconception is that denoising must be expressed only in pixel space. CoCo-BERT performs denoising at the sequence-representation level while coupling it to cross-modal matching. ALDEN applies contrastive loss to pretrained semantic feature embeddings and couples it to an anatomy-aware discriminator. rRCM operates on latent representations of a time-conditioned ViT. Contrastive diffusion training derives a classifier-like objective from log-likelihood ratios between noisy marginals. The underlying denoising mechanism is therefore representation-level in several of the reported systems.

The open issues named in the literature are also domain-specific. ALDEN identifies dependence on PVM domain gap, compute overhead from cross-attention in the discriminator and contrastive sampling, the possibility that random negative sampling may miss rare tissue patterns, the extension from 2D slices to full 3D volumes, and joint finetuning of the PVM as future directions (Wang et al., 11 Aug 2025). Contrastive diffusion training isolates denoiser degradation in regions far outside the training distribution as a core sampling problem, especially for parallel sampling (Wu et al., 2024). rRCM shows that mis-specified orientation—non-adjacent time-steps or different noise realizations—reduces performance (Lei et al., 22 Jan 2025). CoCo-BERT begins from the argument that masked inputs “would inevitably introduce noise for cross-modal matching proxy task,” motivating a denoising objective specifically oriented to that masking process (Luo et al., 2021).

Across these works, oriented contrastive denoising emerges as a framework for turning known structure in corruption, pairing, or dynamics into a discriminative training signal. The orientation signal determines what counts as faithful recovery, whether that recovery is defined as true video–sentence alignment, low-OOD denoiser behavior, anatomical consistency, or invariant representation along a diffusion trajectory.
