Privileged Self-Access Matters for Introspection in AI (2508.14802v1)
Abstract: Whether AI models can introspect is an increasingly important practical question. But there is no consensus on how introspection should be defined. Beginning from a recently proposed "lightweight" definition, we argue instead for a thicker one. On our proposal, introspection in AI is any process that yields information about internal states and is more reliable than any process of equal or lower computational cost available to a third party. Using experiments in which LLMs reason about their internal temperature parameters, we show that models can appear to introspect in the lightweight sense while failing to introspect meaningfully under our proposed definition.
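The proposed definition turns on a comparison: a model's self-report counts as introspection only if it is more reliable than any equally cheap process available to an outside observer. Below is a minimal, self-contained sketch of that comparison, not the authors' code; the "model" is a toy softmax sampler, its self-report is a simulated noisy guess, and the third-party observer fits a temperature to the same samples. All names and parameters here are hypothetical stand-ins.

```python
"""Toy sketch of the paper's introspection criterion (assumptions, not the
authors' experiment): compare a model's self-reported temperature against a
third-party estimate computed from the model's outputs alone."""
import math
import random

LOGITS = [2.0, 1.0, 0.0, -1.0]  # fixed toy vocabulary logits

def sample(temperature: float, n: int) -> list[int]:
    """Draw n tokens from softmax(LOGITS / temperature)."""
    weights = [math.exp(l / temperature) for l in LOGITS]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(range(len(LOGITS)), probs, k=n)

def self_report(true_temp: float) -> float:
    """Hypothetical stand-in for asking the model to report its own
    temperature; modeled as the true value plus noise."""
    return true_temp + random.gauss(0.0, 0.3)

def third_party_estimate(tokens: list[int]) -> float:
    """Equal-cost outside observer: grid-search the temperature whose
    softmax distribution best matches observed token frequencies."""
    counts = [tokens.count(i) / len(tokens) for i in range(len(LOGITS))]
    best_t, best_err = 1.0, float("inf")
    for t in (0.1 * k for k in range(1, 31)):
        weights = [math.exp(l / t) for l in LOGITS]
        total = sum(weights)
        probs = [w / total for w in weights]
        err = sum((p - c) ** 2 for p, c in zip(probs, counts))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

true_temp = 0.8
tokens = sample(true_temp, n=500)
print("self-report error: ", abs(self_report(true_temp) - true_temp))
print("third-party error: ", abs(third_party_estimate(tokens) - true_temp))
```

With enough samples, the third-party fit recovers the temperature closely; under the proposed definition, a self-report that is no more accurate than this equal-cost external estimate would not count as genuine introspection.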