- The paper introduces pairing regularization to address many-to-one collapse by enforcing latent-sample consistency.
- It combines a contrastive loss with standard adversarial methods to achieve higher coverage and intra-mode diversity, even in collapse-prone settings.
- Experiments on synthetic data and CIFAR-10 demonstrate improved mapping fidelity and scalable integration with existing GAN stabilization techniques.
Pairing Regularization for Mitigating Many-to-One Collapse in GANs
Introduction and Motivation
Despite extensive innovation in generative adversarial networks (GANs), mode collapse remains an unsolved problem that impedes the development of models that are both high-fidelity and diverse. The literature focuses mainly on inter-mode collapse (mode dropping), where the generator fails to cover the full support of the data distribution. However, structural deficiencies in GAN training persist even when all modes appear to be covered. Specifically, intra-mode collapse, herein termed "many-to-one collapse," refers to the phenomenon where considerable portions of the latent space are mapped to identical or highly similar regions in data space. The result is reduced intra-mode diversity, which conventional regularization and evaluation methods neither prevent nor detect.
The work introduces pairing regularization as a direct mechanism to address the many-to-one collapse. This approach complements established stabilization techniques (e.g., gradient penalties, data augmentation) by explicitly enforcing a local correspondence between latent variables and generated samples, incentivizing the generator to preserve latent variation in the sample space rather than simply maximizing support coverage.
Failure Modes in GANs
Earlier work mitigated inter-mode collapse via gradient penalties, spectral normalization, and data augmentation, resulting in GANs with higher recall and more complete coverage of the data manifold. However, these techniques do not constrain the structure of the mapping from latent space to data space, as evidenced by the persistent concentration of probability mass and the limited diversity within modes.



Figure 1: Two-dimensional Gaussian mixture experiments reveal that gradient penalties prevent mode dropping but fail to overcome intra-mode collapse, whereas pairing regularization distributes samples more uniformly and, in tandem with gradient penalties, attains both mode coverage and intra-mode diversity.
Coverage metrics provide a useful diagnostic because, unlike recall, they are sensitive to how uniformly and diversely the generated samples are distributed. The distinction is clearest on continuous manifolds such as ring distributions, where all baseline methods achieve similar recall but only pairing regularization provides uniform coverage.
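To make the role of coverage concrete, the following is a minimal sketch of a k-nearest-neighbor coverage score. It assumes that "coverage" means the fraction of real points whose k-NN ball contains at least one generated point; the source does not spell out its exact definition, and the function names and toy data are illustrative only.

```python
import math

def knn_radius(points, index, k):
    """Distance from points[index] to its k-th nearest other point."""
    p = points[index]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != index)
    return dists[k - 1]

def coverage(real, fake, k=1):
    """Fraction of real points whose k-NN ball holds at least one fake point.

    Assumed definition: the ball around each real point has radius equal to
    the distance to its k-th nearest real neighbor.
    """
    covered = 0
    for i, x in enumerate(real):
        radius = knn_radius(real, i, k)
        if any(math.dist(x, y) <= radius for y in fake):
            covered += 1
    return covered / len(real)

# Toy data: ten real points spread evenly along a line segment (one "mode").
real = [(float(i), 0.0) for i in range(10)]
spread = list(real)                 # generator spreads samples over the mode
collapsed = [(0.0, 0.0)] * 10       # generator maps every latent to one point

print(coverage(real, spread))       # 1.0
print(coverage(real, collapsed))    # 0.2
```

In this toy case the collapsed generator still lands on the mode, yet it covers only the two real points nearest to its single output, which is the kind of gap a uniformity-sensitive metric is meant to expose.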



Figure 2: For ring-shaped target distributions, pairing regularization alone or in conjunction with gradient penalties leads to uniform sample distribution, indicating mitigation of many-to-one collapse.
Pairing regularization introduces a contrastive identity objective directly into the generator's update. Given $B$ latent vectors $z_i$ and their generated samples $G_\theta(z_i)$, the model constructs a matching task: discriminating true latent-sample pairs among all possible (including negative) pairings within the minibatch. The pairing loss is:
$$\mathcal{L}_{\mathrm{pair}} = \frac{1}{B} \sum_{i=1}^{B} \left[ -\log \frac{\exp(\ell_{ii})}{\sum_{j=1}^{B} \exp(\ell_{ij})} \right]$$
where $\ell_{ij}$ denotes pairwise similarity scores with temperature $\tau$. This loss is backpropagated only into the generator and a small auxiliary pairing network, ensuring that for each generated image, its latent code remains locally identifiable within the batch. This penalizes many-to-one mappings structurally, rather than at the distributional level.
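The pairing loss can be sketched in a few lines once the score matrix is given. The example below is a minimal illustration rather than the authors' implementation: it takes the scores as a plain B-by-B list (entry i, j is the score of item i against candidate j, already divided by the temperature) and does not model the auxiliary pairing network that would produce them. The two hand-built score matrices are hypothetical.

```python
import math

def pairing_loss(scores):
    """Mean over i of -log( exp(scores[i][i]) / sum_j exp(scores[i][j]) )."""
    batch = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_sum_exp - row[i]
    return total / batch

tau = 0.1   # temperature (value chosen only for illustration)
B = 4

# Hypothetical scores when every sample matches its own latent best.
identifiable = [[(1.0 if i == j else 0.0) / tau for j in range(B)]
                for i in range(B)]

# Hypothetical scores when samples are indistinguishable (many-to-one
# collapse): every candidate scores the same, so the match is a coin toss.
collapsed = [[1.0 / tau] * B for _ in range(B)]

print(pairing_loss(identifiable))   # close to 0
print(pairing_loss(collapsed))      # equals log(B) up to rounding
```

When the scores carry no information about which pair is the true one, the loss sits at log B, so any reduction below that level requires the generated samples to remain distinguishable by their latent codes.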
The overall training objective augments the standard adversarial loss with the pairing regularization weighted by $\lambda_{\mathrm{pair}}$, allowing harmonization with any stabilization regime (e.g., R1 gradient penalty, ADA). Importantly, the pairing module operates orthogonally to the discriminator, requiring no architectural modification.
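Written out, and assuming the generator minimizes its adversarial loss, the combined generator objective takes the following form. The symbols $\mathcal{L}_G$ and $\mathcal{L}_{\mathrm{adv}}^{G}$ are notation introduced here for illustration and do not come from the source.

$$\mathcal{L}_G = \mathcal{L}_{\mathrm{adv}}^{G} + \lambda_{\mathrm{pair}} \, \mathcal{L}_{\mathrm{pair}}$$

Setting $\lambda_{\mathrm{pair}} = 0$ recovers the unregularized baseline, and the discriminator's own loss, including any gradient penalty, is left unchanged.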
Experimental Results
Synthetic Data: Diagnosing Collapse
Controlled experiments on toy distributions (e.g., 2D Gaussian mixtures, ring, Gaussian grid) reveal the structural impact of pairing regularization. Unlike precision and recall, coverage clearly separates standard stabilization methods from those augmented with pairing regularization. The latter achieve notably higher coverage scores, indicating that the mapping preserves variation across the support of the latent distribution.



Figure 3: On a 25-mode 2D Gaussian grid, pairing regularization enhances both global mode coverage and intra-mode diversity, overcoming the limitations of solely stabilization-based approaches.
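A 25-mode grid of this kind is easy to reproduce for diagnostic purposes. The sketch below assumes a 5-by-5 layout with equal mode weights; the spacing, standard deviation, and helper names are illustrative choices, not values taken from the source.

```python
import random

def sample_gaussian_grid(n, side=5, spacing=2.0, std=0.05, seed=0):
    """Draw n points from a side x side grid of isotropic 2D Gaussians."""
    rng = random.Random(seed)
    offset = (side - 1) / 2.0
    points = []
    for _ in range(n):
        row, col = rng.randrange(side), rng.randrange(side)
        cx, cy = (col - offset) * spacing, (row - offset) * spacing
        points.append((rng.gauss(cx, std), rng.gauss(cy, std)))
    return points

def modes_hit(points, side=5, spacing=2.0):
    """Count grid modes that receive at least one point (nearest-center rule)."""
    offset = (side - 1) / 2.0
    hit = set()
    for x, y in points:
        col = round(x / spacing + offset)
        row = round(y / spacing + offset)
        if 0 <= col < side and 0 <= row < side:
            hit.add(row * side + col)
    return len(hit)

data = sample_gaussian_grid(2000)
print(modes_hit(data))                  # 25: every mode is represented
print(modes_hit([(0.0, 0.0)] * 100))    # 1: all samples sit on the center mode
```

Note that a mode count of this kind registers only whether each mode is reached. A generator that places a single repeated point on every mode would still score 25, which is why a mode count alone cannot detect many-to-one collapse.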
Further ablation with alternatives like relativistic GAN losses and mode-seeking regularization (MS-GAN) shows that only pairing directly mitigates many-to-one collapse, achieving balanced improvements in both precision and coverage.
CIFAR-10: Regimes With and Without Augmentation
On class-conditional CIFAR-10, the benefit of pairing regularization is most evident in collapse-prone regimes (e.g., no data augmentation, strong gradient penalty). Pairing-GAN achieves higher coverage (0.731 vs. 0.637 for StyleGAN2) with competitive precision and recall, indicating improved intra-class diversity that recall alone does not capture.

Figure 4: Pairing-GAN exhibits higher coverage during training, particularly in the absence of data augmentation, thereby directly reducing many-to-one collapse effects.
In augmentation-stabilized training (e.g., ADA-enabled), all methods converge to similar precision, recall, and coverage, although pairing provides marginal improvements in precision at each evaluation point.

Figure 5: Training with ADA results in convergence of all methods in terms of precision, recall, and coverage, with pairing regularization slightly increasing precision throughout training.
Theoretical and Practical Implications
This research underscores the importance of analyzing GAN failures beyond support-based diagnostics. While inter-mode collapse can be largely handled via discriminator-centric regularization and architectural advances, persistent many-to-one collapse, which manifests as local degeneracies in the generator mapping, is addressed only through direct generator-side constraints. Pairing regularization stands out as an easily integrated, architecture-agnostic method for enhancing intra-mode diversity.
Practically, pairing regularization is especially useful under limited data or in collapse-prone scenarios, ensuring that generators maintain high-fidelity latent-to-data correspondence rather than defaulting to nearest-mode covering behavior. Even under heavy stabilization (ADA), the tendency toward many-to-one mappings is masked rather than resolved; hence, coverage metrics and structural regularization should become more standard.
Future Directions
Potential research directions include:
- Integrating pairing regularization into diffusion models and autoregressive generators.
- Extending the regularizer for continuous, high-dimensional manifolds beyond classification/image-space tasks.
- Studying synergies with diversity-inducing priors or mutual information maximization objectives for fully latent-disentangled representations.
- Applying coverage-sensitive regularization to text, speech, and multimodal generation tasks where latent collapse is harder to diagnose.
Conclusion
Pairing regularization provides a principled solution to many-to-one collapse in GANs, bridging a critical gap left by stabilization and diversity-encouraging methods. It structurally enforces robust latent-to-sample correspondences, improving intra-mode diversity and overall generative performance, especially in challenging training settings. This methodological advance suggests that further progress in generative modeling demands not only more stable training but also direct pressure on the generator mapping’s structure to ensure models learn meaningful, diverse representations aligned with the full complexity of the data distribution.
Figure 6: Pairing-GAN generated image samples on conditional CIFAR-10, reflecting enhanced intra-class diversity and mapping fidelity induced by coupling latent and sample spaces.