Strata–trajectory correspondence in RL latent spaces
Establish that distinct strata in the stratified token-embedding latent space of the Transformer-XL-based Proximal Policy Optimization agent trained on the Searing Spotlights two-coins environment correspond to different state–action trajectories, and determine whether the local dimension increases when the agent’s latent representation lies near an intersection of strata.
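As a rough illustration of what a change in local dimension near an intersection of strata can look like, here is a minimal sketch on synthetic data. It does not use the agent's latent space and is not the volume growth transform of the cited paper. It assumes a toy stratified set (a line crossing a plane in R^3) and estimates dimension as the slope of log N(r) against log r, where N(r) counts sample points within distance r of a chosen center. The function name `local_dimension`, the sample sizes, and the radii are hypothetical choices made for illustration.

```python
import numpy as np

def local_dimension(points, center, radii):
    """Slope of log N(r) vs log r, where N(r) = number of points within r of center.

    Hypothetical helper: a simple log-log fit, assumed here for illustration only.
    """
    dists = np.linalg.norm(points - center, axis=1)
    counts = np.array([(dists <= r).sum() for r in radii])
    slope, _intercept = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

rng = np.random.default_rng(0)

# Toy stratified set: a 1-dimensional stratum (the z-axis) crossing
# a 2-dimensional stratum (the plane z = 0) at the origin.
n_line, n_plane = 2000, 40000
line = np.column_stack([np.zeros(n_line), np.zeros(n_line),
                        rng.uniform(-1.0, 1.0, n_line)])
plane = np.column_stack([rng.uniform(-1.0, 1.0, n_plane),
                         rng.uniform(-1.0, 1.0, n_plane),
                         np.zeros(n_plane)])
points = np.vstack([line, plane])

# Radii are chosen small enough that balls around the off-intersection
# centers below touch only one stratum.
radii = np.geomspace(0.1, 0.3, 8)

d_line = local_dimension(points, np.array([0.0, 0.0, 0.6]), radii)   # on the line only
d_plane = local_dimension(points, np.array([0.5, 0.5, 0.0]), radii)  # on the plane only
d_cross = local_dimension(points, np.array([0.0, 0.0, 0.0]), radii)  # at the intersection

print(f"line: {d_line:.2f}, plane: {d_plane:.2f}, intersection: {d_cross:.2f}")
```

In this sketch the estimate should sit near 1 on the line and near 2 on the plane, and it rises above the line's value at the crossing point because the ball there collects samples from both strata. The exact value at the crossing depends on the sampling densities and the radii, so it should be read as a qualitative signal rather than a precise dimension.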
References
We conjecture that distinct strata in the latent space correspond to different state-action trajectories, with increases in local dimension occurring when the agent is in a region where strata intersect.
— Exploring the Stratified Space Structure of an RL Game with the Volume Growth Transform
(2507.22010 - Curry et al., 29 Jul 2025) in Conclusion