
Revisiting the Role of Label Smoothing in Enhanced Text Sentiment Classification

Published 11 Dec 2023 in cs.CL, cs.AI, and cs.LG | (2312.06522v2)

Abstract: Label smoothing is a widely used technique in domains such as text classification, image classification, and speech recognition, and is known for effectively combating model overfitting. However, there is little fine-grained analysis of how label smoothing enhances text sentiment classification. To fill this gap, this article performs a set of in-depth analyses on eight text sentiment classification datasets and three deep learning architectures (TextCNN, BERT, and RoBERTa) under two learning schemes: training from scratch and fine-tuning. By tuning the smoothing parameters, we achieve improved performance on almost all datasets for each model architecture. We further investigate the benefits of label smoothing and find that it can accelerate the convergence of deep models and make samples with different labels more easily distinguishable.
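For readers unfamiliar with the technique, the following is a minimal sketch of the commonly used uniform form of label smoothing, in which a fraction `epsilon` of the target probability is spread evenly over all classes. It is an illustration only: the abstract does not state the exact formulation or the smoothing values used in the paper, and the function names (`smooth_labels`, `smoothed_cross_entropy`) and the default `epsilon=0.1` are assumptions made for this example.

```python
import math


def smooth_labels(label, num_classes, epsilon=0.1):
    """Turn a hard class index into a smoothed target distribution.

    Assumed (standard) form: (1 - epsilon) * one_hot + epsilon / num_classes.
    """
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    # Every class receives an equal share of the smoothing mass.
    target = [epsilon / num_classes] * num_classes
    # The true class keeps the remaining probability mass.
    target[label] += 1.0 - epsilon
    return target


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_cross_entropy(logits, label, epsilon=0.1):
    """Cross-entropy between the smoothed target and the model's softmax output."""
    probs = softmax(logits)
    target = smooth_labels(label, len(logits), epsilon)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# Hypothetical binary sentiment example: class 0 = negative, class 1 = positive.
logits = [2.0, 0.5]
hard_loss = smoothed_cross_entropy(logits, label=0, epsilon=0.0)
soft_loss = smoothed_cross_entropy(logits, label=0, epsilon=0.1)
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy against the hard label. With a positive `epsilon`, a confident correct prediction is penalized slightly more, which discourages the model from pushing its logits to extremes. "Tuning the smoothing parameters" in the abstract would correspond, under this assumed form, to searching over values of `epsilon`.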

