Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models (2403.12809v1)

Published 19 Mar 2024 in cs.CL and cs.AI

Abstract: In many real natural language processing application scenarios, practitioners not only aim to maximize predictive performance but also seek faithful explanations for the model predictions. Rationales and importance distribution given by feature attribution methods (FAs) provide insights into how different parts of the input contribute to a prediction. Previous studies have explored how different factors affect faithfulness, mainly in the context of monolingual English models. On the other hand, the differences in FA faithfulness between multilingual and monolingual models have yet to be explored. Our extensive experiments, covering five languages and five popular FAs, show that FA faithfulness varies between multilingual and monolingual models. We find that the larger the multilingual model, the less faithful the FAs are compared to its counterpart monolingual models. Our further analysis shows that the faithfulness disparity is potentially driven by the differences between model tokenizers. Our code is available: https://github.com/casszhao/multilingual-faith.

Authors (2)
  1. Zhixue Zhao (23 papers)
  2. Nikolaos Aletras (72 papers)
Citations (1)

Summary

Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models

Introduction

Recent advances in NLP have emphasized the importance of understanding model behavior beyond predictive performance. In particular, the faithfulness of feature attribution methods (FAs) in explaining model predictions has attracted significant attention. This paper presents a quantitative analysis of FA faithfulness across multilingual and monolingual fine-tuned language models. Through experiments spanning five languages, five FAs, and a diverse set of NLP tasks, the paper shows how FA faithfulness differs between multilingual and monolingual settings.

Empirical Study Design

The empirical investigation covers five languages (English, Chinese, Spanish, French, and Hindi), using models with similar architectures but different pre-training data and supported vocabularies. We fine-tuned two multilingual models (mBERT and XLM-R) and ten monolingual models (two per language) on fifteen datasets spanning a wide variety of NLP tasks, including sentiment analysis, topic classification, and natural language inference. Five popular FAs, namely Attention, Scaled Attention, InputXGrad, Integrated Gradients, and DeepLift, were evaluated against the faithfulness criteria Sufficiency and Comprehensiveness, along with soft variants of these metrics that perturb input tokens in proportion to their attributed importance.
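As a concrete illustration of the two criteria: Sufficiency measures the probability drop when only the rationale tokens are kept, while Comprehensiveness measures the drop when the rationale tokens are removed. The sketch below computes both over a toy scorer; `predict_prob`, the cue-word scoring, and the importance values are hypothetical stand-ins, not the paper's fine-tuned models or FAs.

```python
# Hedged sketch of hard faithfulness metrics (Sufficiency and
# Comprehensiveness). `predict_prob` stands in for a fine-tuned
# classifier's probability for the predicted class; here it is a
# toy bag-of-words scorer (an illustrative assumption).

def predict_prob(tokens):
    # Toy stand-in: probability rises with the count of "positive" cue words.
    cues = {"great", "excellent", "good"}
    score = sum(1 for t in tokens if t in cues)
    return min(1.0, 0.2 + 0.25 * score)

def top_k_rationale(tokens, importances, k):
    # Keep the k tokens the feature attribution method ranks highest.
    ranked = sorted(range(len(tokens)), key=lambda i: importances[i], reverse=True)
    keep = set(ranked[:k])
    rationale = [t for i, t in enumerate(tokens) if i in keep]
    rest = [t for i, t in enumerate(tokens) if i not in keep]
    return rationale, rest

def faithfulness(tokens, importances, k):
    full = predict_prob(tokens)
    rationale, rest = top_k_rationale(tokens, importances, k)
    sufficiency = full - predict_prob(rationale)    # small gap => rationale suffices
    comprehensiveness = full - predict_prob(rest)   # large gap => rationale was needed
    return sufficiency, comprehensiveness

tokens = ["the", "film", "was", "great", "and", "excellent"]
importances = [0.0, 0.1, 0.0, 0.9, 0.0, 0.8]  # hypothetical FA scores
suff, comp = faithfulness(tokens, importances, k=2)
```

The soft variants evaluated in the paper follow the same idea but perturb every token with strength proportional to its importance, rather than hard-selecting a top-k rationale.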

Key Findings

Our results reveal intriguing disparities in FA faithfulness:

  • Multilingual Model Size Impact: Larger multilingual models consistently yielded less faithful FA explanations than their monolingual counterparts, suggesting that the added complexity of larger models challenges the fidelity of FAs. This pattern was more pronounced in models based on the RoBERTa architecture.
  • Tokenization Differences: The paper further sheds light on tokenization as a significant factor contributing to FA faithfulness disparities. Multilingual models, particularly those utilizing SentencePiece tokenization (like XLM-R), exhibited more aggressive token splitting compared to monolingual models, impacting the faithfulness metrics.
  • FA Performance Variability: Integrated Gradients showed the largest difference in faithfulness scores across model types, highlighting the potential sensitivity of certain FAs to the underlying model and tokenization strategies.
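The token-splitting effect can be illustrated with a toy fertility measure (average number of subwords per word): a vocabulary with less per-language coverage fragments words more aggressively. The greedy longest-match tokenizer and both vocabularies below are simplified assumptions for illustration, not the actual mBERT or XLM-R tokenizers.

```python
# Hedged illustration of tokenizer fragmentation ("fertility").
# Vocabularies and the greedy longest-match scheme are assumptions,
# not the real WordPiece/SentencePiece implementations.

def greedy_tokenize(word, vocab):
    # Greedy longest-match segmentation; falls back to single characters.
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])
            i += 1
    return pieces

def fertility(sentence, vocab):
    # Average subwords produced per whitespace-delimited word.
    words = sentence.split()
    return sum(len(greedy_tokenize(w, vocab)) for w in words) / len(words)

# Hypothetical vocabularies: the "monolingual" one covers whole words,
# the "multilingual" one spreads its budget across many languages.
mono_vocab = {"explanation", "faithfulness", "matters"}
multi_vocab = {"explan", "ation", "faith", "ful", "ness", "mat", "ters"}

sent = "explanation faithfulness matters"
mono_f = fertility(sent, mono_vocab)    # one subword per word
multi_f = fertility(sent, multi_vocab)  # each word split into pieces
```

Higher fertility means importance mass is spread over more, shorter subwords, which is one plausible mechanism behind the faithfulness gap the paper attributes to tokenizer differences.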

Implications and Future Directions

This paper underscores the necessity of considering the choice between multilingual and monolingual models in applications requiring reliable explanations of model predictions. The observed faithfulness disparities, driven by factors like model size and tokenization, prompt a reevaluation of FA application in multilingual settings. Looking ahead, these findings pave the way for further research into optimizing FAs for multilingual models and developing new metrics capable of capturing the nuanced behavior of these models more effectively. Future studies might also explore the adaptability of FAs to low-resource languages and the potential of monolingual models in enhancing FA faithfulness across diverse linguistic landscapes.