User responses to explicit displays of LLMs’ verbalized assumptions

Determine how end users respond during real interactions when a large language model explicitly displays its verbalized assumptions about the user’s intent and needs, and characterize how user identities and attitudes toward AI moderate these responses and experiences.

Background

The paper introduces Verbalized Assumptions, a framework for eliciting large language models' (LLMs') assumptions about users, and shows that these assumptions are causally linked to sycophantic behaviors. It further demonstrates that steering the internal representation subspaces associated with these assumptions can reduce social sycophancy while preserving overall model performance.
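
The paper's exact elicitation prompts and steering procedure are not reproduced here; the Python sketch below only illustrates the general pattern the description implies: (a) prompting a model to verbalize its assumptions about a user, and (b) damping a direction in the residual stream via a forward hook, the standard activation-steering recipe. All names (`elicit_assumptions`, `add_steering_hook`), the model, the layer index, the scale, and the random placeholder direction are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of assumption elicitation and activation steering.
# Not the paper's method; prompts and hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper's models may differ
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def elicit_assumptions(user_message: str) -> str:
    """Ask the model to verbalize its assumptions about the user."""
    prompt = (
        f"User message: {user_message}\n"
        "Before answering, list the assumptions you are making about "
        "this user's intent, knowledge, and needs:\n-"
    )
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=80, do_sample=False)
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

def add_steering_hook(layer_idx: int, direction: torch.Tensor, scale: float):
    """Subtract the projection onto a unit-norm direction from one layer's
    hidden states; the paper's subspace-level method may differ."""
    direction = direction / direction.norm()

    def hook(_module, _inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        proj = (hidden @ direction).unsqueeze(-1) * direction
        steered = hidden - scale * proj
        return (steered, *output[1:]) if isinstance(output, tuple) else steered

    return model.transformer.h[layer_idx].register_forward_hook(hook)

# Usage with a random placeholder direction (the paper derives directions
# from assumption-associated subspaces, which is not reproduced here):
handle = add_steering_hook(layer_idx=6,
                           direction=torch.randn(model.config.hidden_size),
                           scale=1.0)
print(elicit_assumptions("My boss ignored my email. Should I confront him?"))
handle.remove()
```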

Building on evidence of a human–AI expectation gap (users often expect more objective information from AI than from other humans), the authors propose future directions that involve exposing these assumptions to users. However, they explicitly note that it remains unknown how users will react to being shown an LLM's implicit assumptions and how individual differences (e.g., identities, attitudes toward AI) will shape these reactions.
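
To make the proposed study concrete, the sketch below shows one possible trial loop that surfaces verbalized assumptions alongside a model response and logs user reactions together with identity and AI-attitude covariates for later moderation analysis. The schema, rating scales, and field names are hypothetical; the paper does not specify a protocol.

```python
# Hypothetical study-trial sketch: display the model's verbalized assumptions
# with each response and record user reactions plus moderator covariates.
# Field names and scales are illustrative, not taken from the paper.
import json
import time

def run_trial(participant, model_response, verbalized_assumptions):
    print(f"Assistant: {model_response}\n")
    print("The assistant assumed the following about you:")
    for assumption in verbalized_assumptions:
        print(f"  - {assumption}")
    record = {
        "participant_id": participant["id"],
        # Self-reported moderators collected at intake.
        "identity": participant["identity"],
        "ai_attitude": participant["ai_attitude"],  # e.g., 1-7 trust-in-AI scale
        "timestamp": time.time(),
        "assumptions_shown": verbalized_assumptions,
        # Reactions to the displayed assumptions.
        "accuracy_rating": int(input("How accurate were these assumptions? (1-7) ")),
        "comfort_rating": int(input("How comfortable were you seeing them? (1-7) ")),
        "free_response": input("Any other reactions? "),
    }
    with open("responses.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```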

References

It remains an open question how users will respond when explicitly confronted with the implicit assumptions that LLMs inevitably make, and how individual differences, like people's identities and attitudes toward AI, shape these expectations and experiences.

Verbalizing LLMs' assumptions to explain and control sycophancy (2604.03058 - Cheng et al., 3 Apr 2026) in Ethical Statement