Necessity of explicit configuration dictionary in LLM-based Turing machine simulations

Determine whether simulating deterministic multi-tape Turing machines (and Turing machines with advice) entirely within large language models, with tape configurations represented as word-to-vector embeddings, necessarily requires an explicit representation of the full dictionary of all required Turing-machine-configuration “words,” or whether alternative internal mechanisms of large language models can achieve the simulation without such explicit dictionary storage. If the latter, specify the architectural assumptions under which such improvements are possible.

Background

The paper proposes simulations of deterministic multi-tape Turing machines (and Turing machines with advice) within LLMs by treating Turing machine configurations as words of a “Turing machine language” and storing these words as word-to-vector embeddings. Theorem 3.1 and its corollary assert that, for inputs of length n whose space complexity S(n) is bounded by k, an appropriately specialized LLM with embeddings of size O(k) can generate the correct sequence of configurations, effectively simulating the Turing machine’s computation.
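The flavor of this construction can be illustrated with a toy sketch (not the paper’s actual simulation, which concerns multi-tape machines and embeddings of size O(k)): each reachable configuration of a space-bounded machine is serialized as a single dictionary “word,” and the transition function becomes a next-token lookup table, so generating the computation is exactly repeated next-word prediction over an explicitly stored dictionary. All names below (`delta`, `encode`, `build_dictionary`) are hypothetical.

```python
# Toy sketch: a space-bounded TM run as next-"word" prediction over an
# explicit configuration dictionary. Not the paper's construction;
# illustrative only.
from typing import Dict, Tuple

# A toy single-tape TM over {0, 1} that flips bits until it reads a blank '_'.
# delta: (state, read_symbol) -> (new_state, written_symbol, head_move)
delta: Dict[Tuple[str, str], Tuple[str, str, int]] = {
    ("q0", "0"): ("q0", "1", +1),
    ("q0", "1"): ("q0", "0", +1),
    ("q0", "_"): ("halt", "_", 0),
}

def encode(state: str, tape: str, head: int) -> str:
    """Serialize a full configuration as one dictionary 'word'."""
    return f"{state}|{tape}|{head}"

def step(state: str, tape: str, head: int) -> Tuple[str, str, int]:
    """Apply one transition of the toy machine."""
    new_state, sym, move = delta[(state, tape[head])]
    tape = tape[:head] + sym + tape[head + 1:]
    return new_state, tape, head + move

def build_dictionary(tape: str) -> Dict[str, str]:
    """Map each configuration word to its successor word.

    This table plays the role of the explicit embedding dictionary:
    once built, the computation is replayed by lookups alone.
    """
    words: Dict[str, str] = {}
    state, head = "q0", 0
    while state != "halt":
        cur = encode(state, tape, head)
        state, tape, head = step(state, tape, head)
        words[cur] = encode(state, tape, head)
    return words

if __name__ == "__main__":
    table = build_dictionary("101_")
    # Replay the run purely as next-word prediction over the dictionary.
    conf = encode("q0", "101_", 0)
    while conf in table:
        conf = table[conf]
    print(conf)  # prints the halting configuration: halt|010_|3
```

The open problem asks, in effect, whether the analogue of `build_dictionary`’s explicit table is unavoidable, or whether attention and other internal mechanisms let the model compute successor configurations without storing every configuration word.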

Following these results, the authors ask whether such simulations must explicitly include the full dictionary of configuration words in the LLM’s embeddings, or whether the internal mechanisms of LLMs (e.g., attention or other architectural features) could avoid this requirement. Resolving this question would clarify the minimal representational needs of LLMs for simulating space-bounded Turing machines and refine the resource trade-offs of the proposed approach.

References

It is an open problem whether the simulations from Theorem \ref{thm:interior} and Corollary \ref{cor:interior} can be improved. For instance, does the model need to explicitly represent the full dictionary of the necessary “Turing machine words”? It seems to depend on the possibilities of the internal model of an LLM.

Large Language Models and the Extended Church-Turing Thesis (2409.06978 - Wiedermann et al., 11 Sep 2024) in Subsection 3.1 (Simulating Turing machines ‘inside’ of LLMs)