A Comparative Study on Textual Saliency of Styles from Eye Tracking, Annotations, and Language Models

Published 19 Dec 2022 in cs.CL | (2212.09873v2)

Abstract: There is growing interest in incorporating eye-tracking data and other implicit measures of human language processing into NLP pipelines. Data from human language processing contain unique insight into human linguistic understanding that could be exploited by LLMs. However, many questions remain unanswered about the nature of this data and how it can best be used in downstream NLP tasks. In this paper, we present eyeStyliency, an eye-tracking dataset for human processing of stylistic text (e.g., politeness). We develop a variety of methods to derive style saliency scores over text from the collected eye-tracking data. We further investigate how this saliency data compares to both human annotation methods and model-based interpretability metrics. We find that while eye-tracking data is unique, it also intersects with both human annotations and model-based importance scores, providing a possible bridge between human- and machine-based perspectives. We propose using this type of data to evaluate the cognitive plausibility of models that interpret style. Our eye-tracking data and processing code are publicly available.