CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery (2404.05366v1)
Abstract: In Generalized Category Discovery (GCD), we cluster unlabeled samples of known and novel classes, leveraging a training dataset of known classes. A salient challenge arises from domain shifts between these datasets. To address this, we present a novel setting, Across Domain Generalized Category Discovery (AD-GCD), and propose CDAD-NET (Class Discoverer Across Domains) as a remedy. CDAD-NET is designed to align potential known-class samples across the labeled (source) and unlabeled (target) datasets, while emphasizing the distinct categorization of the target data. To facilitate this, we propose an entropy-driven adversarial learning strategy that accounts for the distance distributions of target samples relative to source-domain class prototypes. In parallel, the discriminative nature of the shared space is upheld through a fusion of three metric learning objectives. In the source domain, our focus is on refining the proximity between samples and their affiliated class prototypes, while in the target domain, we integrate a neighborhood-centric contrastive learning mechanism, enriched with an adept neighbor-mining approach. To further accentuate the nuanced feature interrelation among semantically aligned images, we champion the concept of conditional image inpainting, on the premise that semantically analogous images are more useful to the task than unrelated ones. Experimentally, CDAD-NET outperforms existing methods by 8-15% on the three AD-GCD benchmarks we present.
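The abstract does not spell out how the "distance distributions of target samples relative to source-domain class prototypes" are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a prototype is the mean of the source features of a class, that distances are Euclidean, and that distances are turned into a distribution with a softmax over negative distances. The function names (`class_prototypes`, `distance_distribution`, `entropy`) and the `temperature` parameter are hypothetical. The sketch only computes the per-sample entropy; how that entropy enters the adversarial objective is not described in the abstract and is not modeled here.

```python
import math


def class_prototypes(features, labels):
    """Assumed prototype definition: mean feature vector of each source class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def distance_distribution(x, prototypes, temperature=1.0):
    """Softmax over negative Euclidean distances from x to every prototype."""
    classes = sorted(prototypes)
    logits = [-math.dist(x, prototypes[c]) / temperature for c in classes]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


# Toy source data: two known classes in a 2-D feature space.
source_features = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
source_labels = [0, 0, 1, 1]
prototypes = class_prototypes(source_features, source_labels)

# A target sample sitting on a prototype versus one halfway between the two.
near_known = distance_distribution([0.0, 1.0], prototypes)
ambiguous = distance_distribution([2.0, 1.0], prototypes)

print(entropy(near_known), entropy(ambiguous))
```

In this toy setup, the target sample that coincides with a source prototype yields a peaked distribution and low entropy, while the sample equidistant from both prototypes yields a uniform distribution and the maximum entropy for two classes. Under the assumptions above, low entropy would suggest a likely known-class sample and high entropy a sample that fits no source class well.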
Authors:
- Sai Bhargav Rongali
- Sarthak Mehrotra
- Ankit Jha
- Mohamad Hassan N C
- Shirsha Bose
- Tanisha Gupta
- Mainak Singha
- Biplab Banerjee