Extent of LLM propositional reasoning-based imagery capacity beyond human working memory limits

Determine the extent of propositional reasoning-based mental imagery capacity in large language models when they are not constrained by human working memory limits. This means evaluating performance on instruction sets with substantially more than 3–5 steps and more than 4 imagined objects, and characterizing how accuracy scales with task complexity.

Background

Humans face working memory constraints (classically described as 7±2 items, with contemporary estimates closer to 3–4), which limit the complexity of imagery tasks that can be benchmarked with human participants.

LLMs are constrained only by their context windows, so they can be evaluated on tasks that exceed human working memory limits. The authors state that the extent of LLMs' propositional reasoning-based imagery capacity is presently unknown, and they propose creating much longer, more compositional trials to probe it.
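One way to characterize how accuracy scales with task complexity is to group scored trials by their number of steps and imagined objects and compute the fraction correct in each group. The sketch below is a minimal illustration of that bookkeeping, not the authors' method: the `TrialResult` record, the `accuracy_by_complexity` helper, and the example values are all hypothetical placeholders rather than data from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialResult:
    """Hypothetical record for one scored imagery trial."""
    n_steps: int      # number of instructions in the trial
    n_objects: int    # number of imagined objects involved
    correct: bool     # whether the model's answer was scored correct


def accuracy_by_complexity(results):
    """Return {(n_steps, n_objects): fraction correct}, sorted by key."""
    counts = defaultdict(lambda: [0, 0])  # key -> [num correct, num total]
    for r in results:
        key = (r.n_steps, r.n_objects)
        counts[key][0] += int(r.correct)
        counts[key][1] += 1
    return {k: c / t for k, (c, t) in sorted(counts.items())}


# Placeholder values for illustration only; these are NOT real results.
example = [
    TrialResult(n_steps=5, n_objects=4, correct=True),
    TrialResult(n_steps=5, n_objects=4, correct=True),
    TrialResult(n_steps=20, n_objects=8, correct=True),
    TrialResult(n_steps=20, n_objects=8, correct=False),
]
curve = accuracy_by_complexity(example)
```

Plotting `curve` against the step and object counts would show where, if anywhere, accuracy begins to fall off as trials grow beyond the range that can be benchmarked with human participants.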

References

Without this limitation, it is unknown the extent of LLMs propositional reasoning capacity.

— Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models (2509.23108 - McCarty et al., 27 Sep 2025) in Section "Future Work"
