Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training (2212.09897v2)

Published 19 Dec 2022 in cs.CL

Abstract: Language tasks involving character-level manipulations (e.g., spelling corrections, arithmetic operations, word games) are challenging for models operating on subword units. To address this, we develop a causal intervention framework to learn robust and interpretable character representations inside subword-based language models. Our method treats each character as a typed variable in a causal model and learns such causal structures by adapting the interchange intervention training method of Geiger et al. (2021). We additionally introduce a suite of character-level tasks that systematically vary in their dependence on meaning and sequence-level context. While character-level models still perform best on purely form-based tasks like string reversal, our method outperforms character-level models on more complex tasks that blend form, meaning, and context, such as spelling correction in context and word search games. Compared with standard subword-based models, our approach also significantly improves robustness on unseen token sequences and leads to human-interpretable internal representations of characters.
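At its core, interchange intervention training swaps the internal representation aligned with a causal variable (here, a character type at a given position) from a forward pass on a source input into a forward pass on a base input, and then trains the model to produce the counterfactual output that the causal model prescribes. The sketch below illustrates one such training step under stated assumptions: `model.encode`, `model.decode`, and the position alignment `char_pos` are hypothetical names for exposition, not the paper's actual API.

```python
import torch
import torch.nn.functional as F

def interchange_intervention_step(model, base_ids, source_ids, char_pos,
                                  counterfactual_labels, optimizer):
    """One interchange-intervention training step for a character variable.

    Hypothetical sketch: swap the hidden state aligned with one character
    position from the source input's forward pass into the base input's
    forward pass, then train the model to emit the counterfactual output
    (the base word with that character replaced by the source character).
    """
    # Encode both inputs; gradients flow through both runs, since the same
    # parameters produce the source and base representations.
    source_hidden = model.encode(source_ids)          # (seq_len, d_model)
    base_hidden = model.encode(base_ids).clone()      # (seq_len, d_model)

    # Intervene: overwrite the slice aligned with the character variable.
    base_hidden[char_pos] = source_hidden[char_pos]

    # Decode from the intervened representation and fit the counterfactual label.
    logits = model.decode(base_hidden)                # (seq_len, vocab_size)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                           counterfactual_labels.view(-1))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice this interchange loss is combined with the ordinary task loss, so the model learns both to solve the task and to localize each character in the aligned slice of its hidden state.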

References (44)
  1. Char2Subword: Extending the subword embedding space using robust character compositionality. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1640–1651, Punta Cana, Dominican Republic. Association for Computational Linguistics.
  2. Approximate causal abstractions. In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, volume 115 of Proceedings of Machine Learning Research, pages 606–615. PMLR.
  3. Sander Beckers and Joseph Y. Halpern. 2019. Abstracting causal models. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01):2678–2685.
  4. Yonatan Belinkov and Yonatan Bisk. 2018. Synthetic and natural noise both break neural machine translation. In International Conference on Learning Representations.
  5. GPT-NeoX-20B: An open-source autoregressive language model. arXiv preprint arXiv:2204.06745.
  6. Kaj Bostrom and Greg Durrett. 2020. Byte pair encoding is suboptimal for language model pretraining. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4617–4624, Online. Association for Computational Linguistics.
  7. Language models are few-shot learners. In Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc.
  8. Canine: Pre-training an Efficient Tokenization-Free Encoder for Language Representation. Transactions of the Association for Computational Linguistics, 10:73–91.
  9. Cicero Dos Santos and Bianca Zadrozny. 2014. Learning character-level representations for part-of-speech tagging. In International Conference on Machine Learning, pages 1818–1826. PMLR.
  10. Cryptonite: A cryptic crossword benchmark for extreme ambiguity in language. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 4186–4192, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
  11. CharacterBERT: Reconciling ELMo and BERT for word-level open-vocabulary representations from characters. In Proceedings of the 28th International Conference on Computational Linguistics, pages 6903–6915, Barcelona, Spain (Online). International Committee on Computational Linguistics.
  12. Causal abstractions of neural networks. In Advances in Neural Information Processing Systems, volume 34, pages 9574–9586.
  13. Faithful, interpretable model explanations via causal abstraction. Stanford AI Lab Blog.
  14. Inducing causal structure for interpretable neural networks. In Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pages 7324–7338. PMLR.
  15. Injecting numerical reasoning skills into language models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 946–958, Online. Association for Computational Linguistics.
  16. DeBERTa: Decoding-enhanced BERT with disentangled attention. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net.
  17. Itay Itzhak and Omer Levy. 2022. Models in a spelling bee: Language models implicitly learn the character composition of tokens. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5061–5068, Seattle, United States. Association for Computational Linguistics.
  18. Ayush Kaushal and Kyle Mahowald. 2022. What do tokens know about their characters and how do they know it? In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2487–2507, Seattle, United States. Association for Computational Linguistics.
  19. Taku Kudo. 2018. Subword regularization: Improving neural network translation models with multiple subword candidates. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 66–75, Melbourne, Australia. Association for Computational Linguistics.
  20. Why don’t people use character-level machine translation? arXiv preprint arXiv:2110.08191.
  21. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
  22. Minh-Thang Luong and Christopher D. Manning. 2016. Achieving open vocabulary neural machine translation with hybrid word-character models. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1054–1063, Berlin, Germany. Association for Computational Linguistics.
  23. Xuezhe Ma and Eduard Hovy. 2016. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1064–1074, Berlin, Germany. Association for Computational Linguistics.
  24. Between words and characters: A brief history of open-vocabulary modeling and tokenization in NLP. arXiv preprint arXiv:2112.10508.
  25. George A. Miller. 1995. WordNet: A lexical database for English. Communications of the ACM, 38(11):39–41.
  26. AmbiPun: Generating humorous puns with ambiguous context. arXiv preprint arXiv:2205.01825.
  27. Deep contextualized word representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Louisiana. Association for Computational Linguistics.
  28. Yuval Pinter. 2021. Integrating approaches to word representation. arXiv preprint arXiv:2109.04876.
  29. Mimicking word embeddings using subword RNNs. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 102–112, Copenhagen, Denmark. Association for Computational Linguistics.
  30. Will it unblend? In Proceedings of the Society for Computation in Linguistics 2021, pages 474–476, Online. Association for Computational Linguistics.
  31. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140):1–67.
  32. Noisy UGC translation at the character level: Revisiting open-vocabulary capabilities and robustness of char-based models. In Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021), pages 199–211, Online. Association for Computational Linguistics.
  33. Decrypting cryptic crosswords: Semantically complex wordplay puzzles as a target for NLP. In Advances in Neural Information Processing Systems.
  34. BLOOM: A 176B-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100.
  35. Timo Schick and Hinrich Schütze. 2019. Attentive mimicking: Better word embeddings by attending to informative contexts. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 489–494, Minneapolis, Minnesota. Association for Computational Linguistics.
  36. Mike Schuster and Kaisuke Nakajima. 2012. Japanese and Korean voice search. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5149–5152. IEEE.
  37. Neural machine translation of rare words with subword units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715–1725, Berlin, Germany. Association for Computational Linguistics.
  38. Charformer: Fast character transformers via gradient-based subword tokenization. In International Conference on Learning Representations.
  39. Automated crossword solving. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3073–3085, Dublin, Ireland. Association for Computational Linguistics.
  40. Finetuned language models are zero-shot learners. In International Conference on Learning Representations.
  41. Causal Proxy Models for concept-based model explanations. arXiv preprint arXiv:2209.14279.
  42. ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models. Transactions of the Association for Computational Linguistics, 10:291–306.
  43. Homophonic pun generation with lexically constrained rewriting. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2870–2876, Online. Association for Computational Linguistics.
  44. OPT: Open Pre-trained Transformer language models. arXiv preprint arXiv:2205.01068.
Authors (4)
  1. Jing Huang (140 papers)
  2. Zhengxuan Wu (37 papers)
  3. Kyle Mahowald (40 papers)
  4. Christopher Potts (113 papers)
Citations (12)