
Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding (2310.14676v1)

Published 23 Oct 2023 in cs.CL

Abstract: Human gaze data offer cognitive information that reflects natural language comprehension. Indeed, augmenting LLMs with human scanpaths has proven beneficial for a range of NLP tasks, including language understanding. However, the applicability of this approach is hampered because the abundance of text corpora is contrasted by a scarcity of gaze data. Although models for the generation of human-like scanpaths during reading have been developed, the potential of synthetic gaze data across NLP tasks remains largely unexplored. We develop a model that integrates synthetic scanpath generation with a scanpath-augmented LLM, eliminating the need for human gaze data. Since the model's error gradient can be propagated throughout all parts of the model, the scanpath generator can be fine-tuned to downstream tasks. We find that the proposed model not only outperforms the underlying LLM, but also achieves performance comparable to that of an LLM augmented with real human gaze data. Our code is publicly available.

Authors (5)
  1. Shuwen Deng (10 papers)
  2. Paul Prasse (10 papers)
  3. David R. Reich (7 papers)
  4. Tobias Scheffer (12 papers)
  5. Lena A. Jäger (14 papers)
Citations (3)
