
State over Tokens (SoT) Overview

Updated 21 December 2025
  • State over Tokens (SoT) is a formal framework that models persistent information state across sequences in domains such as digital currencies, language models, and Transformer heads.
  • It distinguishes between token-based and account-based approaches, impacting security, interpretability, and scalability in various applications.
  • Empirical studies support the framework: reasoning tokens function as computational state in LLMs, second-order SoT heads improve Transformer classification, and state-centric definitions clarify blockchain design.

State over Tokens (SoT) refers to a set of formal frameworks and design philosophies arising independently in digital currency theory, LLM interpretability, and Transformer-based classification heads. Across these domains, SoT captures a decisive shift: rather than treating tokens or intermediate representations merely as labels, words, or unitary coins, modern approaches acknowledge and exploit the way state (the persistent, evolving information in a computation or protocol) is encoded over sequences of tokens or transferable objects. SoT distinctions fundamentally impact security, interpretability, scalability, and the kinds of computation or reasoning achievable.

1. Formal Models of State over Tokens

In both digital currency economics and autoregressive model computation, SoT privileges the explicit representation of a global state as a fundamental organizing principle.

In digital currencies, Chan (Chan, 2021) grounds SoT in formal economic theory: a token-based system is one in which global state is object-oriented. Each token (e.g., a coin or digital asset) is an explicit, permutation-invariant entry with a unique identity and owner, so the state space is $S_t^{\mathrm{token}} = \{ (o_k, u_i) : o_k \in \mathcal{O},\ u_i \in \mathcal{U} \}$, where $\mathcal{O}$ is the set of objects/tokens and $\mathcal{U}$ the set of users. In contrast, an account-based system's state is $S_t^{\mathrm{acc}} = \{ (u_i, b_i) : u_i \in \mathcal{U},\ b_i \in \mathbb{R} \}$, mapping each user to a real-valued balance.

In LLM computation (Levy et al., 14 Dec 2025), SoT arises from the autoregressive update process: each generation cycle defines a state $s_t$ as the prefix sequence of all previously emitted tokens. The only persistent memory across cycles is this token sequence, not the internal activations of the model. Formally,

$$s_t = \langle x_1, x_2, \ldots, x_{t-1} \rangle; \qquad x_t \sim p(x_t \mid s_t); \qquad s_{t+1} = s_t \oplus x_t$$

where \oplus denotes token concatenation. Thus, the sequence of externalized tokens encodes all state required for subsequent steps.
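The update rule above can be sketched with a toy deterministic stand-in for the model. The function `toy_model` and the token names are illustrative inventions, not anything from the cited papers; the point is only that the prefix sequence is the sole state carried between steps.

```python
from typing import List, Tuple

def toy_model(state: Tuple[str, ...]) -> str:
    """Stand-in for p(x_t | s_t): a deterministic next-token rule
    that depends only on the externalized state (the prefix)."""
    return f"tok{len(state)}"

def generate(prompt: List[str], steps: int) -> List[str]:
    state = tuple(prompt)           # s_t: the prefix of emitted tokens
    for _ in range(steps):
        x_t = toy_model(state)      # x_t ~ p(x_t | s_t)
        state = state + (x_t,)      # s_{t+1} = s_t (+) x_t, i.e. concatenation
    return list(state)

print(generate(["a", "b"], 3))      # ['a', 'b', 'tok2', 'tok3', 'tok4']
```

Note that no variable survives between iterations except `state` itself, mirroring the claim that the emitted tokens encode all state required for subsequent steps.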

2. SoT in Digital Currencies: State Representations and Economic Implications

The SoT framework in digital currencies was systematized by Chan (Chan, 2021), who demonstrated that the economic underpinnings of account-based and token-based systems are not determined solely by terminology, but by how state is encoded and manipulated.

Account-Based State

  • Each agent has a mutable balance, and transactions mutate balances via additions and subtractions.
  • The global state is a mapping from identity to balance.
  • Verification requires authenticating the initiator’s identity and confirming sufficient balance.

Token-Based State

  • Each unit of value is a discrete, transferable object.
  • Ownership is implicit in possession of the token; verification requires only proving the authenticity of the token, not the sender's identity.
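The two verification regimes above can be contrasted in a minimal sketch. This is a hypothetical illustration of the state-centric distinction, not an implementation from Chan (2021): account-based state maps identities to balances and verifies sender balance, while token-based state maps token identifiers to owners and verifies only that the token exists.

```python
def account_transfer(state: dict, sender: str, receiver: str, amount: int) -> dict:
    """Account-based: authenticate the initiator and check their balance,
    then mutate balances by subtraction and addition."""
    if state.get(sender, 0) < amount:            # balance check (identity assumed authenticated)
        raise ValueError("insufficient balance")
    new = dict(state)
    new[sender] -= amount
    new[receiver] = new.get(receiver, 0) + amount
    return new

def token_transfer(state: dict, token_id: str, receiver: str) -> dict:
    """Token-based: verify the token's authenticity (here, its existence),
    then reassign ownership; no identity check on the sender is needed."""
    if token_id not in state:                    # authenticity check only
        raise ValueError("unknown token")
    new = dict(state)
    new[token_id] = receiver
    return new

accounts = {"alice": 5, "bob": 0}
tokens = {"coin-1": "alice", "coin-2": "alice"}
print(account_transfer(accounts, "alice", "bob", 3))  # {'alice': 2, 'bob': 3}
print(token_transfer(tokens, "coin-1", "bob"))        # {'coin-1': 'bob', 'coin-2': 'alice'}
```

The asymmetry in what each transfer must verify is exactly the object- vs. account-orientation of the global state.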

Chan’s extended, state-centric definition establishes that “token-based” and “account-based” nomenclatures should map directly to whether the system’s global state is object- or account-oriented, irrespective of whether it uses terms such as “token” in a marketing sense.

In practical terms, UTXO-based cryptocurrencies (e.g., Bitcoin) can be modeled as object-oriented, as all value resides in the set of unspent outputs. Nevertheless, because transaction authorization and double-spend prevention still require intermediated, network-wide verification rather than bearer-style transfer, these systems implement significant features of account-based control, reflecting a nuanced SoT analysis.

3. SoT in Reasoning and LLMs

The SoT framework in LLM reasoning interprets intermediate “reasoning tokens” not as explanations but as an externalized computational state ensuring the persistence of information across stateless decoding steps (Levy et al., 14 Dec 2025). Each reasoning token can be viewed as a commit point in a growing state string. Key properties:

  • At each generation step, the prefix sequence of tokens forms the model's sole recoverable memory.
  • The model reads this state to reconstruct internal representations required for the next token prediction.
  • CoT (Chain-of-Thought) tokens serve as “state scaffolding,” facilitating correct reasoning even though their surface semantics as natural language text may not faithfully reflect the underlying computation.
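The stateless-step picture above can be made concrete with a toy example. The rule in `next_token` is an invented placeholder, but it illustrates the intervention logic used in the cited ablation studies: because the prefix is the model's only memory, editing a reasoning token changes the continuation if and only if it changes the state the next step reads.

```python
def next_token(prefix: tuple) -> str:
    """Toy stateless decoding step: the output is a pure function of the
    externalized prefix; no hidden memory survives between calls."""
    return "even" if sum(len(t) for t in prefix) % 2 == 0 else "odd"

state_a = ("think", "step1")
state_b = ("think", "STEP1")      # intervention invisible to this toy rule
state_c = ("think", "step-one")   # intervention that alters the state content

print(next_token(state_a) == next_token(state_b))  # True: same effective state
print(next_token(state_a) == next_token(state_c))  # False: edited state, new continuation
```

Under the SoT view, such interventions probe which features of the token string actually carry state, independent of the tokens' surface meaning as text.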

Empirical studies have demonstrated that generated CoT text may omit crucial rationales, contradict earlier steps, or encode opaque subroutines not interpretable as natural language. By treating reasoning tokens as state rather than log, SoT provides a coherent explanation for these discrepancies (Levy et al., 14 Dec 2025).

4. SoT in Transformer Classification Heads

“SoT” also designates the “Second-Order Transformer” classification head in vision and LLMs (Xie et al., 2021). Here, the SoT head leverages both the classification token and high-level word/patch tokens from the Transformer’s output to form a pooled second-order state.

  • The head applies Multi-Headed Global Cross-Covariance Pooling (MGCrP) with singular-value power normalization, computing cross-covariance statistics between sets of token vectors.
  • The pooled token features and the classification token are fused into the final class logits via a simple sum fusion.
  • Empirical results show significant gains in both accuracy and robustness in vision (ImageNet, ImageNet-A) and language (CoLA, RTE) benchmarks, especially when second-order statistics are included.
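The core operation can be sketched as follows. This is a minimal NumPy approximation of cross-covariance pooling with singular-value power normalization; the exact MGCrP formulation in Xie et al. (2021), including its multi-head partitioning and learned parameters, may differ.

```python
import numpy as np

def cross_cov_pool(X: np.ndarray, Y: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Second-order pooling sketch: cross-covariance between two (n, d)
    token feature sets, with singular values raised to the power alpha."""
    Xc = X - X.mean(axis=0, keepdims=True)   # center each token set
    Yc = Y - Y.mean(axis=0, keepdims=True)
    C = Xc.T @ Yc / X.shape[0]               # (d, d) cross-covariance matrix
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U @ np.diag(s ** alpha) @ Vt      # power-normalize the spectrum

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 8))                 # e.g., 16 patch/word tokens, dim 8
Y = rng.normal(size=(16, 8))
pooled = cross_cov_pool(X, Y)
print(pooled.shape)                          # (8, 8): second-order state over the token set
```

The pooled matrix summarizes pairwise feature interactions across the whole token set, which is the sense in which the head extracts state beyond a single pooled vector.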

This application of SoT underscores the utility of extracting and fusing state over entire token sets, extending beyond what a single pooled vector can represent.

5. Empirical Evidence and Comparative Analyses

A range of empirical analyses supports the SoT principle across domains:

  • In digital currencies (Chan, 2021), mapping out UTXO-based and account-based models demonstrates specific trade-offs in traceability, double-spend defense, transaction semantics, and system auditability.
  • In LLMs (Levy et al., 14 Dec 2025), interventions and ablations on reasoning token sequences change model outputs in ways inconsistent with pure narrative explanation, but consistent with the tokens serving as computational state.
  • In Transformer states (Pal et al., 2023), “Future Lens” experiments show that hidden state vectors at particular layers can anticipate multi-token futures, indicating that tokens encode substantial latent trajectory state rather than only immediate next-token information.

Table: SoT Instances by Domain

Domain             | SoT Role                                              | Reference
Digital currencies | State representation: object- vs. account-orientation | (Chan, 2021)
LLM reasoning      | Externalized state: reasoning token accumulation      | (Levy et al., 14 Dec 2025)
Vision/LLMs        | Pooling/fusion of token state for classification      | (Xie et al., 2021)

6. Research Questions, Implications, and Future Directions

The SoT paradigm raises a series of new research questions:

  • What minimally sufficient token subsets anchor reasoning (“thought-anchoring”) in LLMs, and can their encoding schemes be reverse-engineered? (Levy et al., 14 Dec 2025)
  • Could state be externalized in modalities other than natural language, such as dense vectors or structured logic? (Levy et al., 14 Dec 2025)
  • What are the trade-offs between state transparency (human interpretability) and efficiency or compactness of externalized state?
  • How do state representations evolve across Transformer layers, and what architectures can best exploit second-order (SoT) features? (Xie et al., 2021, Pal et al., 2023)
  • In blockchains, how do state-centric definitions affect regulatory, auditing, and scalability choices? (Chan, 2021)

A plausible implication is that future systems—in both computation and finance—will increasingly optimize not for token or log semantics per se, but for explicit, auditable, and efficiently manipulable global state representations.

7. Broader Significance

State over Tokens reframes fundamental questions in computing machinery, interpretability, and digital assets. In cryptographic finance, SoT clarifies critical distinctions in system architecture and trust. In machine learning, SoT offers mechanistic insight into how multi-step reasoning emerges and can be controlled. Unified by their attention to stateful information encoded in sequences of tokens or objects, all SoT variants highlight the primacy of architectural and representational choices in determining system properties, transparency, and performance (Chan, 2021, Levy et al., 14 Dec 2025, Xie et al., 2021).
