Towards Faithful Explanations: Boosting Rationalization with Shortcuts Discovery (2403.07955v2)
Abstract: The remarkable success of neural networks has spurred interest in selective rationalization, which explains prediction results by identifying a small subset of the inputs sufficient to support them. Existing methods still suffer from two problems: they adopt shortcuts in the data to compose rationales, and large-scale human-annotated rationales are scarce. In this paper, we propose a Shortcuts-fused Selective Rationalization (SSR) method, which boosts rationalization by discovering and exploiting potential shortcuts. Specifically, SSR first designs a shortcuts discovery approach to detect several potential shortcuts. Then, by introducing the identified shortcuts, we propose two strategies to mitigate the problem of utilizing shortcuts to compose rationales. Finally, we develop two data augmentation methods to close the gap in the number of annotated rationales. Extensive experimental results on real-world datasets clearly validate the effectiveness of our proposed method.