
Contrastive variational information bottleneck for aspect-based sentiment analysis (2303.02846v3)

Published 6 Mar 2023 in cs.CL and cs.AI

Abstract: Deep learning techniques have dominated the literature on aspect-based sentiment analysis (ABSA), achieving state-of-the-art performance. However, deep models generally suffer from spurious correlations between input features and output labels, which substantially hurt robustness and generalization. In this paper, we propose to reduce spurious correlations for ABSA via a novel Contrastive Variational Information Bottleneck framework (called CVIB). The proposed CVIB framework is composed of an original network and a self-pruned network, and these two networks are optimized simultaneously via contrastive learning. Concretely, we employ the Variational Information Bottleneck (VIB) principle to learn an informative and compressed network (the self-pruned network) from the original network, which discards the superfluous patterns or spurious correlations between input features and prediction labels. Then, self-pruning contrastive learning is devised to pull together semantically similar positive pairs and push apart dissimilar pairs: the representations of the anchor learned by the original and self-pruned networks are regarded as a positive pair, while the representations of two different sentences within a mini-batch are treated as a negative pair. To verify the effectiveness of our CVIB method, we conduct extensive experiments on five benchmark ABSA datasets. The experimental results show that our approach outperforms strong competitors in terms of overall prediction performance, robustness, and generalization. Code and data to reproduce the results in this paper are available at: https://github.com/shesshan/CVIB.
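
The pairing scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so the InfoNCE-style form below, the cosine similarity, the `temperature` value, and the function names are all assumptions chosen for illustration rather than the authors' implementation. Each sentence's representation from the original network is matched with its own self-pruned representation (positive pair) and contrasted against the self-pruned representations of the other sentences in the mini-batch (negative pairs).

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def self_pruning_contrastive_loss(original, pruned, temperature=0.1):
    """InfoNCE-style loss over a mini-batch (illustrative assumption).

    original[i] and pruned[i] are the representations of sentence i from
    the original and self-pruned networks: the positive pair.
    original[i] and pruned[j] with j != i act as negative pairs.
    """
    n = len(original)
    total = 0.0
    for i in range(n):
        # Similarity of anchor i to every pruned representation in the batch.
        logits = [cosine(original[i], pruned[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denom - logits[i]
    return total / n


if __name__ == "__main__":
    # Toy two-sentence batch with hypothetical 2-d representations.
    original = [[1.0, 0.0], [0.0, 1.0]]
    pruned = [[0.9, 0.1], [0.1, 0.9]]
    print(self_pruning_contrastive_loss(original, pruned))
```

Under these assumptions, the loss is small when each sentence's two views agree and grows when an anchor is closer to another sentence's pruned representation than to its own.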
