Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell (2406.14673v2)

Published 20 Jun 2024 in cs.CL

Abstract: LLMs exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information retrieval and utilization, a "know but don't tell" phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.
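The probing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' setup: the "hidden states" below are synthetic vectors, and every name (`make_synthetic_hidden_states`, `fit_linear_probe`, `probe_accuracy`) and parameter value is a hypothetical choice made for illustration. The sketch only shows what it means for a representation to encode the position of target information, namely that a simple classifier trained on the vectors can recover that position on held-out examples.

```python
import numpy as np


def make_synthetic_hidden_states(n_samples=600, hidden_dim=32, n_positions=5,
                                 noise=0.5, seed=0):
    """Build toy 'hidden states' (assumption: synthetic data, not model activations).

    Each sample is a random direction tied to the target's position plus Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    positions = rng.integers(0, n_positions, size=n_samples)
    directions = rng.normal(size=(n_positions, hidden_dim))
    states = directions[positions] + noise * rng.normal(size=(n_samples, hidden_dim))
    return states, positions


def fit_linear_probe(states, labels, n_classes):
    """Fit a linear probe by least squares onto one-hot position labels."""
    X = np.hstack([states, np.ones((len(states), 1))])  # append a bias column
    Y = np.eye(n_classes)[labels]                       # one-hot targets
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def probe_accuracy(W, states, labels):
    """Fraction of samples whose highest-scoring class matches the true position."""
    X = np.hstack([states, np.ones((len(states), 1))])
    preds = (X @ W).argmax(axis=1)
    return float((preds == labels).mean())


states, positions = make_synthetic_hidden_states()
split = 400  # first 400 samples train the probe, the rest are held out
W = fit_linear_probe(states[:split], positions[:split], n_classes=5)
acc = probe_accuracy(W, states[split:], positions[split:])
print(f"held-out probe accuracy: {acc:.2f}")
```

In an actual experiment, `states` would be replaced by activations taken from a model layer and `positions` by the index of the target information in the prompt. High probe accuracy alongside low answer accuracy would correspond to the "know but don't tell" gap the abstract describes.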


Authors (5)

Taiming Lu (5 papers)
Muhan Gao (3 papers)
Kuai Yu (10 papers)
Adam Byerly (8 papers)
Daniel Khashabi (83 papers)

