Causal Intersectionality and Dual Form of Gradient Descent for Multimodal Analysis: a Case Study on Hateful Memes (2308.11585v2)
Abstract: Amidst the rapid expansion of Machine Learning (ML) and Large Language Models (LLMs), understanding the semantics within their mechanisms is vital. Causal analyses define semantics, while gradient-based methods are essential to eXplainable AI (XAI) for interpreting a model's 'black box'. Integrating the two, we investigate how a model's mechanisms reveal its causal effect on evidence-based decision-making. Prior research indicates that intersectionality - the combined impact of an individual's demographic attributes - can be framed as an Average Treatment Effect (ATE). This paper demonstrates that hateful meme detection can be viewed as ATE estimation grounded in intersectionality principles, and that summarized gradient-based attention scores highlight distinct behaviors across three Transformer models. We further show that the LLM Llama-2 can discern the intersectional aspects of the detection task through in-context learning, and that this learning process can be explained via the meta-gradient, a secondary (dual) form of the gradient. In conclusion, this work furthers the dialogue between Causality and XAI. Our code is available online (see the External Resources section).
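The abstract relies on two technical notions without stating them: the Average Treatment Effect and the dual (meta-gradient) form of gradient descent that underlies the in-context-learning analysis. As a reading aid, the sketch below restates both in standard notation; the symbols (Y, T, W_0, eta, e_i, x_i) are generic placeholders rather than identifiers from the paper's code, and the paper's exact formulation may differ.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Average Treatment Effect: the expected change in outcome Y when
% intervening to set the treatment T to 1 versus 0.
\begin{equation*}
  \mathrm{ATE} \;=\; \mathbb{E}\bigl[\,Y \mid \mathrm{do}(T{=}1)\,\bigr]
  \;-\; \mathbb{E}\bigl[\,Y \mid \mathrm{do}(T{=}0)\,\bigr]
\end{equation*}

% Dual form of gradient descent for a linear layer: the trained weights
% W_0 + \Delta W, applied to a test input x, equal the initial layer's
% output plus a linear-attention readout over the N training inputs x_i,
% with the scaled error signals -\eta e_i acting as values.
\begin{equation*}
  (W_0 + \Delta W)\,x \;=\; W_0\,x \;+\; \sum_{i=1}^{N} (-\eta\, e_i)\,\bigl(x_i^{\top} x\bigr)
\end{equation*}

\end{document}
```

Roughly speaking, the meta-gradient view of in-context learning reads the attention paid to demonstration examples as an implicit update of this second form, applied at inference time without changing the model's weights.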
Authors: Yosuke Miyanishi, Minh Le Nguyen