Simple-Sampling and Hard-Mixup with Prototypes to Rebalance Contrastive Learning for Text Classification (2405.11524v1)

Published 19 May 2024 in cs.CL

Abstract: Text classification is a crucial and fundamental task in natural language processing. Compared with the earlier paradigm of pre-training and fine-tuning with a cross-entropy loss, the recently proposed supervised contrastive learning approach has received tremendous attention due to its powerful feature learning capability and robustness. Although several studies have applied this technique to text classification, some limitations remain. First, many text datasets are imbalanced, and the learning mechanism of supervised contrastive learning is sensitive to data imbalance, which can harm model performance. Moreover, these models use a separate cross-entropy classification branch and a supervised contrastive learning branch without explicit mutual guidance between them. To this end, we propose a novel model named SharpReCL for imbalanced text classification tasks. First, we obtain a prototype vector for each class in the balanced classification branch to act as its representation. Then, by explicitly leveraging the prototype vectors, we construct a proper and sufficient target sample set of the same size for each class and perform the supervised contrastive learning procedure on it. The empirical results show the effectiveness of our model, which even outperforms popular LLMs across several datasets.
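The abstract outlines a two-step procedure: compute a prototype vector per class, then use those prototypes to assemble an equally sized target set for every class before applying the supervised contrastive loss. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the function names, the fixed mixup coefficient lam, and the strategy of padding minority classes by mixing real samples with their prototype are illustrative assumptions based only on the abstract.

```python
# Minimal sketch of the two ideas described in the abstract:
# (1) class prototypes as mean embeddings, and (2) a balanced,
# prototype-augmented target set fed to a supervised contrastive loss.
import torch
import torch.nn.functional as F


def class_prototypes(feats, labels, num_classes):
    """Mean embedding per class; feats is (N, d), labels is (N,)."""
    protos = torch.zeros(num_classes, feats.size(1), device=feats.device)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = feats[mask].mean(dim=0)
    return F.normalize(protos, dim=1)


def balanced_target_set(feats, labels, protos, per_class, lam=0.7):
    """Build an equally sized sample set for every class.

    Majority classes are subsampled; minority classes are padded by mixing
    real samples with their class prototype (an illustrative stand-in for
    the paper's prototype-based hard-mixup).
    """
    out_feats, out_labels = [], []
    for c in range(protos.size(0)):
        cls_feats = feats[labels == c]
        if cls_feats.size(0) == 0:
            chosen = protos[c].expand(per_class, -1)
        elif cls_feats.size(0) >= per_class:
            chosen = cls_feats[torch.randperm(cls_feats.size(0))[:per_class]]
        else:
            need = per_class - cls_feats.size(0)
            idx = torch.randint(0, cls_feats.size(0), (need,))
            mixed = lam * cls_feats[idx] + (1 - lam) * protos[c]
            chosen = torch.cat([cls_feats, mixed], dim=0)
        out_feats.append(chosen)
        out_labels.append(torch.full((per_class,), c, device=feats.device))
    return torch.cat(out_feats), torch.cat(out_labels)


def supervised_contrastive_loss(feats, labels, temperature=0.1):
    """Standard SupCon loss (Khosla et al., 2020) over the balanced set."""
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.t() / temperature
    not_self = ~torch.eye(feats.size(0), dtype=torch.bool, device=feats.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & not_self
    sim = sim.masked_fill(~not_self, float("-inf"))  # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    mean_log_prob_pos = (
        log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1).clamp(min=1)
    )
    return -mean_log_prob_pos.mean()
```

A training step would pass encoder outputs through class_prototypes and balanced_target_set before computing the loss; the actual SharpReCL branch design, hard-mixup strategy, and hyperparameters are specified in the paper itself.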
