This paper, "From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning" (Shani et al., 21 May 2025), investigates whether LLMs develop internal conceptual representations that are analogous to human conceptual structures, particularly focusing on the trade-off between informational compression and semantic fidelity. The authors introduce a novel information-theoretic framework to quantitatively compare these strategies, leveraging seminal human categorization benchmarks.
The core research questions addressed are:
- [RQ1]: To what extent do concepts emergent in LLMs align with human-defined conceptual categories?
- [RQ2]: Do LLMs and humans exhibit similar internal geometric structures within these concepts, especially concerning item typicality?
- [RQ3]: How do humans and LLMs differ in their strategies for balancing representational compression with the preservation of semantic fidelity when forming concepts?
To answer these questions, the paper utilizes data from three classic cognitive psychology experiments: Rosch (1973), Rosch (1975), and McCloskey & Glucksberg (1978). These datasets provide human judgments on category membership and item typicality for a total of 1,049 items across 34 categories, which the authors digitized and aggregated. A diverse suite of LLMs is analyzed, including encoder-only models (BERT family) and various decoder-only models (Llama, Gemma, Qwen, Phi, Mistral families) ranging from 300 million to 72 billion parameters. The analysis focuses on static token-level embeddings from the input embedding layer of these models.
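Because the analysis operates on static input-layer embeddings rather than contextual activations, extracting the relevant vectors is mechanically simple. Below is a minimal sketch assuming the Hugging Face `transformers` library; the checkpoint name, the sub-token averaging strategy, and the example items are illustrative placeholders rather than the paper's exact pipeline.

```python
# Minimal sketch: pulling static, input-layer token embeddings for a word list.
# The model name and the multi-token averaging strategy are assumptions for
# illustration, not the authors' exact recipe.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-large-uncased"  # any encoder- or decoder-only checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
embedding_matrix = model.get_input_embeddings().weight.detach()  # (vocab_size, hidden_dim)

def static_embedding(word: str) -> torch.Tensor:
    """Look up input-embedding rows for a word; average if it splits into sub-tokens."""
    token_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    return embedding_matrix[token_ids].mean(dim=0)

items = ["robin", "penguin", "chair", "apple"]  # placeholder items, not the benchmark lists
item_vectors = torch.stack([static_embedding(w) for w in items])
print(item_vectors.shape)  # (num_items, hidden_dim)
```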
The central methodological contribution is an information-theoretic framework inspired by Rate-Distortion Theory (RDT) and the Information Bottleneck (IB) principle. This framework evaluates conceptual clusters $C$ formed over items $X$ (token embeddings) using an objective function

$$\mathcal{L}(X, C; \beta) = \mathrm{Complexity}(X, C) + \beta \cdot \mathrm{Distortion}(X, C),$$
where:
- Complexity(X, C) is the mutual information $I(X; C)$ between items and their cluster assignments, measuring the informational cost of representing items $X$ through clusters $C$. Lower $I(X; C)$ implies greater compression.
- Distortion(X, C) is the average intra-cluster variance of item embeddings, quantifying the loss of semantic fidelity: $\mathrm{Distortion}(X, C) = \frac{1}{|X|} \sum_{c \in C} \sum_{x \in c} \lVert x - \bar{x}_c \rVert^2$, where $\bar{x}_c$ is the centroid (mean embedding) of cluster $c$. Lower distortion means items lie closer to their cluster centroids. A lower $\mathcal{L}$ score indicates a more statistically "efficient" representation under this framework (see the code sketch below).
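As a concrete illustration of this objective, here is a minimal sketch assuming hard cluster assignments and a uniform prior over items, in which case $I(X; C)$ reduces to the entropy of the cluster-size distribution. The function names and default $\beta$ are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the RDT/IB-style objective described above:
#   L(X, C; beta) = I(X; C) + beta * Distortion(X, C).
# Assumes hard assignments and uniform p(x), so I(X; C) = H(C).
import numpy as np

def complexity(labels: np.ndarray) -> float:
    """I(X;C) for deterministic assignments: entropy (bits) of the cluster-size distribution."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def distortion(embeddings: np.ndarray, labels: np.ndarray) -> float:
    """Average squared distance of each item embedding to its cluster centroid."""
    total = 0.0
    for c in np.unique(labels):
        members = embeddings[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return total / len(embeddings)

def objective(embeddings: np.ndarray, labels: np.ndarray, beta: float = 1.0) -> float:
    """L(X, C; beta) = Complexity + beta * Distortion (beta value here is a placeholder)."""
    return complexity(labels) + beta * distortion(embeddings, labels)
```

Human categories and LLM-derived clusters can then be compared directly by evaluating `objective(embeddings, labels)` under both labelings of the same items.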
The empirical investigation yields several key findings:
- [RQ1] Broad Conceptual Alignment: LLM-derived clusters (obtained via k-means on token embeddings, with $k$ set to the number of human categories) show significant alignment with human-defined conceptual categories, as measured by Adjusted Mutual Information (AMI), Normalized Mutual Information (NMI), and Adjusted Rand Index (ARI). Notably, some encoder models (such as BERT-large-uncased) exhibit strong alignment, sometimes outperforming much larger decoder-only models, suggesting that factors beyond scale influence human-like categorical abstraction (see the code sketch after this list).
- [RQ2] Limited Fidelity to Fine-Grained Semantics: LLMs demonstrate only modest alignment with human-perceived fine-grained semantic distinctions, such as item typicality. This was assessed by correlating (Spearman's $\rho$) human typicality ratings with the cosine similarity between an item's token embedding and the embedding of its human-assigned category name (e.g., 'robin' to 'bird'). The correlations were generally weak, indicating that LLMs do not consistently represent human-perceived typical items as significantly more similar to their category label's embedding.
- [RQ3] Divergent Efficiency Strategies in the Compression-Meaning Trade-off: LLMs exhibit markedly superior information-theoretic efficiency compared to human conceptual structures when evaluated by the $\mathcal{L}(X, C; \beta)$ objective and by mean cluster entropy. LLM-derived clusters consistently achieve lower (more "optimal" by this statistical measure) $\mathcal{L}$ values and lower entropy than human conceptual categories. This suggests LLMs are highly optimized for statistical compactness.
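The RQ1 and RQ2 analyses referenced above can be sketched as follows, assuming the item embeddings from the earlier snippet together with placeholder arrays for human category labels, per-item category-name embeddings, and typicality ratings; scikit-learn and SciPy are assumed, and none of this is the authors' released code.

```python
# Minimal sketch of the RQ1 (cluster alignment) and RQ2 (typicality correlation)
# analyses. All inputs are placeholder arrays standing in for the benchmark data.
import numpy as np
from scipy.stats import spearmanr
from sklearn.cluster import KMeans
from sklearn.metrics import (adjusted_mutual_info_score,
                             adjusted_rand_score,
                             normalized_mutual_info_score)

def cluster_alignment(embeddings: np.ndarray, human_labels: np.ndarray, seed: int = 0) -> dict:
    """RQ1: k-means with k = number of human categories, scored against human labels."""
    k = len(np.unique(human_labels))
    pred = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(embeddings)
    return {
        "AMI": adjusted_mutual_info_score(human_labels, pred),
        "NMI": normalized_mutual_info_score(human_labels, pred),
        "ARI": adjusted_rand_score(human_labels, pred),
    }

def typicality_correlation(item_vecs: np.ndarray, category_vecs: np.ndarray,
                           typicality: np.ndarray) -> float:
    """RQ2: Spearman rho between human typicality ratings and item-to-category-name
    cosine similarity (category_vecs[i] is the embedding of item i's category name)."""
    norms = np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(category_vecs, axis=1)
    cosine = (item_vecs * category_vecs).sum(axis=1) / norms
    rho, _ = spearmanr(typicality, cosine)
    return rho
```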
The discussion and conclusion highlight a fundamental divergence: LLMs appear optimized for aggressive statistical compression, likely due to their training on vast text corpora, achieving information-theoretically efficient representations. This focus, however, limits their ability to capture the richer, prototype-based semantic nuances crucial for deep human understanding. In contrast, human conceptual systems seem to prioritize adaptive richness, contextual flexibility, and functional utility. The apparent statistical "suboptimality" of human concepts (higher entropy and higher $\mathcal{L}$ scores) likely reflects optimization for a broader set of cognitive demands, such as robust generalization, inferential power, and effective communication.
The authors suggest that merely scaling current LLM approaches may be insufficient for achieving human-like understanding. Future AI development could benefit from incorporating principles that foster richer conceptual structures, potentially using frameworks like the $\mathcal{L}$ objective for guidance and evaluation. For cognitive science, LLMs serve as valuable computational models for testing theories of human concept formation, highlighting the unique optimization pressures shaping human cognition. The paper concludes that moving "from tokens to thoughts" will require AI to embrace principles that cultivate richer, contextually aware conceptual structures, recognizing that what appears as statistical "inefficiency" might be a hallmark of robust intelligence.