Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT (2310.07582v2)
Abstract: Foundation models exhibit significant capabilities in decision-making and logical deduction. Nonetheless, debate continues over whether they genuinely understand the world or merely mimic it stochastically. This paper examines a simple transformer trained on Othello, extending prior research to deepen understanding of Othello-GPT's emergent world model. The investigation reveals that Othello-GPT encapsulates a linear representation of opposing pieces, and that this representation causally steers its decision-making. We further elucidate the interplay between the linear world representation and causal decision-making, and their dependence on layer depth and model complexity. The code is publicly available.
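The central tool behind such claims is the linear probe: a logistic-regression classifier trained on a layer's activations to decode a board property (e.g. whether a square holds an opposing piece). The sketch below illustrates the technique on synthetic data; the array `acts` stands in for residual-stream activations from a trained Othello-GPT, and `true_w` is a hypothetical ground-truth direction, both invented here for illustration.

```python
# Minimal linear-probe sketch. In the paper's setting, `acts` would be
# activations extracted from a transformer layer and `labels` would be
# board-state facts (e.g. "this square is held by the opponent").
import numpy as np

rng = np.random.default_rng(0)
d_model, n_samples = 64, 2000

# Synthetic stand-in for layer activations.
acts = rng.normal(size=(n_samples, d_model))
# Hypothetical direction assumed to linearly encode the probed feature.
true_w = rng.normal(size=d_model)
labels = (acts @ true_w > 0).astype(float)

# Logistic-regression probe trained with plain gradient descent.
w = np.zeros(d_model)
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(acts @ w)))   # predicted probabilities
    grad = acts.T @ (p - labels) / n_samples
    w -= lr * grad

preds = (acts @ w > 0).astype(float)
accuracy = (preds == labels).mean()
print(f"probe accuracy: {accuracy:.3f}")
```

High probe accuracy indicates the feature is linearly decodable from the activations; establishing that the representation *causally* steers play additionally requires intervening on activations along the learned direction, which this sketch does not cover.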
- Guillaume Alain and Yoshua Bengio. 2018. Understanding intermediate layers using linear classifier probes. arXiv:1610.01644 [stat.ML]
- Shir Amir, Yossi Gandelsman, Shai Bagon, and Tali Dekel. 2021. Deep ViT Features as Dense Visual Descriptors. arXiv:2112.05814
- Yonatan Belinkov. 2021. Probing Classifiers: Promises, Shortcomings, and Advances. arXiv:2102.12452 [cs.CL]
- Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. 610–623.
- Kevin Clark, Urvashi Khandelwal, Omer Levy, and Christopher D. Manning. 2019. What Does BERT Look at? An Analysis of BERT’s Attention. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Florence, Italy, 276–286. https://doi.org/10.18653/v1/W19-4828
- Antonia Creswell, Murray Shanahan, and Irina Higgins. 2022. Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning. arXiv preprint arXiv:2205.09712.
- Dieuwke Hupkes, Sara Veldhoen, and Willem Zuidema. 2018. Visualisation and ‘diagnostic classifiers’ reveal how recurrent and recursive neural networks process hierarchical structure. Journal of Artificial Intelligence Research 61 (2018), 907–926.
- Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, and Martin Wattenberg. 2023. Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=DeG07_TcZvT
- Neel Nanda. 2023. Actually, Othello-GPT Has A Linear Emergent World Model. <https://neelnanda.io/mechanistic-interpretability/othello>
- Chris Olah, Nick Cammarata, Ludwig Schubert, Gabriel Goh, Michael Petrov, and Shan Carter. 2020. Zoom In: An Introduction to Circuits. Distill 5, 3 (2020), e00024.001.
- Chris Olah, Alexander Mordvintsev, and Ludwig Schubert. 2017. Feature Visualization. Distill. https://doi.org/10.23915/distill.00007
- Catherine Olsson, Nelson Elhage, Neel Nanda, et al. 2022. In-context Learning and Induction Heads. arXiv:2209.11895 [cs.LG]
- Aarohi Srivastava et al. 2022. Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models. arXiv preprint arXiv:2206.04615.
- Shubham Toshniwal, Sam Wiseman, Karen Livescu, and Kevin Gimpel. 2021. Learning Chess Blindfolded: Evaluating Language Models on State Tracking. arXiv:2102.13249