DNA: Denoised Neighborhood Aggregation for Fine-grained Category Discovery
Abstract: Discovering fine-grained categories from coarsely labeled data is a practical and challenging task, which can bridge the gap between the demand for fine-grained analysis and the high annotation cost. Previous works mainly focus on instance-level discrimination to learn low-level features, but ignore semantic similarities between data, which may prevent these models from learning compact cluster representations. In this paper, we propose Denoised Neighborhood Aggregation (DNA), a self-supervised framework that encodes semantic structures of data into the embedding space. Specifically, we retrieve k-nearest neighbors of a query as its positive keys to capture semantic similarities between data and then aggregate information from the neighbors to learn compact cluster representations, which can make fine-grained categories more separable. However, the retrieved neighbors can be noisy and contain many false-positive keys, which can degrade the quality of learned embeddings. To cope with this challenge, we propose three principles to filter out these false neighbors for better representation learning. Furthermore, we theoretically justify that the learning objective of our framework is equivalent to a clustering loss, which can capture semantic similarities between data to form compact fine-grained clusters. Extensive experiments on three benchmark datasets show that our method can retrieve more accurate neighbors (21.31% accuracy improvement) and outperform state-of-the-art models by a large margin (average 9.96% improvement on three metrics). Our code and data are available at https://github.com/Lackel/DNA.
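The retrieve, filter, and aggregate steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the three filtering principles, so the single filter below (keep only neighbors that share the query's coarse label) is an assumption made for illustration. Mean pooling likewise stands in for the aggregation step, whose actual form is not given here. All function names and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_neighbors(embeddings, query_idx, k):
    """Indices of the k items most similar to the query, excluding the query."""
    query = embeddings[query_idx]
    scored = [
        (cosine(query, emb), i)
        for i, emb in enumerate(embeddings)
        if i != query_idx
    ]
    # Highest similarity first; ties broken by lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


def filter_by_coarse_label(neighbors, query_idx, coarse_labels):
    """Assumed filter: drop neighbors whose coarse label differs from the query's."""
    target = coarse_labels[query_idx]
    return [i for i in neighbors if coarse_labels[i] == target]


def aggregate(embeddings, neighbors):
    """Mean of the neighbor embeddings (a stand-in for aggregation)."""
    if not neighbors:
        return None
    dim = len(embeddings[0])
    return [
        sum(embeddings[i][d] for i in neighbors) / len(neighbors)
        for d in range(dim)
    ]


# Toy data: four 2-d embeddings with coarse labels.
embeddings = [
    [1.0, 0.0],  # 0: query
    [0.9, 0.1],  # 1: close, same coarse label
    [0.8, 0.3],  # 2: close, different coarse label (a false positive)
    [0.0, 1.0],  # 3: far away
]
coarse_labels = ["a", "a", "b", "b"]

neighbors = retrieve_neighbors(embeddings, query_idx=0, k=2)
kept = filter_by_coarse_label(neighbors, 0, coarse_labels)
pooled = aggregate(embeddings, kept)
```

In this toy example, item 2 is retrieved as a neighbor because it lies close to the query, but the assumed filter discards it. This is the kind of false-positive key the abstract says degrades the learned embeddings.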