
Contextual Control without Memory Growth in a Context-Switching Task

Published 3 Apr 2026 in cs.AI, cs.IT, and cs.LG | (2604.03479v1)

Abstract: Context-dependent sequential decision making is commonly addressed either by providing context explicitly as an input or by increasing recurrent memory so that contextual information can be represented internally. We study a third alternative: realizing contextual dependence by intervening on a shared recurrent latent state, without enlarging recurrent dimensionality. To this end, we introduce an intervention-based recurrent architecture in which a recurrent core first constructs a shared pre-intervention latent state, and context then acts through an additive, context-indexed operator. We evaluate this idea on a context-switching sequential decision task under partial observability. We compare three model families: a label-assisted baseline with direct context access, a memory baseline with enlarged recurrent state, and the proposed intervention model, which uses no direct context input to the recurrent core and no memory growth. On the main benchmark, the intervention model performs strongly without additional recurrent dimensions. We also evaluate the models using the conditional mutual information $I(C;O \mid S)$ as a theorem-motivated operational probe of contextual dependence at fixed latent state. For task-relevant phase-1 outcomes, the intervention model exhibits positive conditional contextual information. Together, these results suggest that intervention on a shared recurrent state provides a viable alternative to recurrent memory growth for contextual control in this setting.

Authors (1)

Summary

  • The paper demonstrates that context-dependent control is achieved by applying context-indexed interventions on a shared recurrent state, eliminating the need for explicit memory growth.
  • Experimental results show that the intervention model solves both task phases on $9/10$ (AB25) and $7/10$ (BA30) seeds and exhibits positive conditional contextual information, making it competitive with memory-based architectures.
  • The study validates information-theoretic predictions and highlights the practical benefits of lightweight, interpretable modulation mechanisms for sequential decision-making.

Intervention-Based Contextual Control Without Memory Growth in Context-Switching Tasks

Problem Setting and Motivation

The study addresses the challenge of context-dependent decision making under partial observability, focusing on how agents can express contextual dependence without explicit context input or increasing recurrent memory. In typical POMDP settings, context is handled either by providing explicit context tokens as input or by expanding the recurrent state to encode contextual information, both of which have architectural and efficiency trade-offs.

The authors propose an alternative: context-dependent control realized via context-indexed intervention on a shared recurrent latent state, avoiding increases in recurrent dimensionality. This premise is motivated by recent information-theoretic work on contextuality and resource requirements in single-state models (Kim, 3 Feb 2026), which indicates that state-space growth is not necessary for maintaining contextual dependence at the level of observable outcomes.

Figure 1: Problem setup illustrating phase-dependent goal switching in a partially observable maze, with agent observations limited to a $3\times3$ field of view and context signaled via a token.

Architectural Approaches and Model Families

Three model families are compared within a minimal partially observable gridworld benchmark that imposes a within-episode context switch dictating different goal targets for each phase:

  • Label-Assisted Baseline (L): Direct context concatenation as input; serves as an upper-bound oracle.
  • Memory Baseline (M): No direct context input; recurrent latent state enlarged ($\dim(z_t)=d+m$) to carry context implicitly.
  • Intervention Model (I): Context is withheld from the recurrent core; instead, a residual, context-indexed linear operator perturbs the recurrent latent state: $z_t' = z_t + \alpha D_{c_t}(z_t)$, where $D_{c_t}$ is a learned, bias-free linear map.

The agent must adapt its policy to target $G1$ or $G2$ depending on both phase and condition (AB or BA order).

Figure 2: Architectural diagram of the three model families, highlighting the separation of context access and memory growth mechanisms.
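The intervention update $z_t' = z_t + \alpha D_{c_t}(z_t)$ can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the helper names (`make_operator`, `matvec`, `intervene`), the random weights standing in for learned ones, the two context labels, and the values of `d` and `alpha` are all assumptions chosen for illustration.

```python
import random

def make_operator(d, rng, scale=0.1):
    """Return a d x d bias-free linear map as a list of rows.
    Random weights are a stand-in for learned parameters (assumption)."""
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(d)]

def matvec(D, z):
    """Multiply matrix D (list of rows) by vector z."""
    return [sum(w * x for w, x in zip(row, z)) for row in D]

def intervene(z, context, operators, alpha=1.0):
    """Residual intervention z' = z + alpha * D_c(z).
    The output has the same dimension as z, so no memory is added."""
    delta = matvec(operators[context], z)
    return [zi + alpha * di for zi, di in zip(z, delta)]

rng = random.Random(0)
d = 4  # hypothetical recurrent dimensionality
operators = {"A": make_operator(d, rng), "B": make_operator(d, rng)}

z = [1.0, -0.5, 0.25, 2.0]  # shared pre-intervention latent state
z_a = intervene(z, "A", operators, alpha=0.5)
z_b = intervene(z, "B", operators, alpha=0.5)

print(len(z_a) == len(z))  # dimensionality is unchanged
print(z_a != z_b)          # different contexts yield different post-intervention states
```

Because the map is bias-free and residual, setting `alpha=0.0` recovers the shared state exactly, and a zero latent state stays zero under every context.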

Experimental Results

Performance is evaluated on benchmark conditions AB25 and BA30 by measuring the fraction of seeds that successfully solve both task phases and by analyzing phase-specific success rates. The training protocol matches the base recurrent dimensionality across models, so that they differ only in how context is integrated.

Key empirical findings:

  • Label-assisted (L) achieves perfect success ($10/10$ seeds) on both benchmarks.
  • Intervention (I) achieves $9/10$ (AB25) and $7/10$ (BA30), with no increase in recurrent state.
  • Memory baselines show non-monotonic scaling: Best performance at $m=16$ (M16: […] AB25, […] BA30), but larger $m$ does not guarantee improvement; M8, M32, and M64 are all outperformed by M16 in BA30.
  • Phase 1 is consistently harder across architectures, indicating memory growth does not guarantee better contextual control in the harder context-dependent sub-task.

Figure 3: Comparative performance (fraction of successful seeds) of model families under AB25 and BA30 benchmarks.

These results challenge the sufficiency of naive memory growth for robust contextual dependency, supporting the intervention architecture as a competitive, parameter-efficient solution in this setting.
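The seed-level metric described above (a seed counts as successful only if both phases are solved) can be sketched as follows. The function name `summarize` and the per-seed outcomes are made up for illustration and are not the paper's data.

```python
def summarize(runs):
    """runs: list of (phase1_solved, phase2_solved) booleans, one pair per seed."""
    n = len(runs)
    both = sum(1 for p1, p2 in runs if p1 and p2)
    return {
        "both_phases": both / n,                    # headline metric
        "phase1": sum(p1 for p1, _ in runs) / n,    # phase-specific rates
        "phase2": sum(p2 for _, p2 in runs) / n,
    }

# Made-up outcomes for 10 seeds (illustrative only): one seed fails phase 1.
runs = [(True, True)] * 9 + [(False, True)]
stats = summarize(runs)
print(stats)
```

Reporting phase-specific rates alongside the joint rate shows which phase is responsible for a failed seed.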

Information-Theoretic Evidence

An operational probe, $I(C;O \mid S)$, quantifies context dependence at fixed recurrent latent state $S$. Estimators use counterfactual distributions over context to compute conditional mutual information for task-relevant outcomes (e.g., hitting the context-appropriate target).

  • Positive $I(C;O \mid S)$ is observed in all models for phase-1 goal-related outcomes, with the intervention model matching or exceeding memory and label-assisted baselines within sampling error:
    • For AB25 (target_hit): L: […], I: […], M16: […] bits.
    • For BA30 (goal3): I: […] bits.
  • Primitive-outcome definitions (one-step actions) yield much lower estimates, highlighting that context-driven distinctions manifest primarily at abstract, goal-level events.

Thus, context remains observable in the outcome distribution even after conditioning on latent state, consistent with the theoretical expectation that contextuality can be preserved without explicit memory extension (Kim, 3 Feb 2026).
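To make the quantity concrete, the sketch below computes a plug-in estimate of $I(C;O \mid S)$ in bits from discrete samples. It is a generic textbook estimator, not the counterfactual estimator used in the paper, and it assumes the latent state has already been discretized into hashable labels; the function name and the toy samples are hypothetical.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(samples):
    """Plug-in estimate of I(C;O|S) in bits from (c, o, s) triples.

    Uses I(C;O|S) = sum p(c,o,s) * log2[ p(c,o,s) p(s) / (p(c,s) p(o,s)) ],
    with probabilities replaced by empirical frequencies.
    """
    n = len(samples)
    n_cos = Counter(samples)
    n_cs = Counter((c, s) for c, o, s in samples)
    n_os = Counter((o, s) for c, o, s in samples)
    n_s = Counter(s for c, o, s in samples)
    total = 0.0
    for (c, o, s), k in n_cos.items():
        # The 1/n factors cancel inside the log, leaving raw counts.
        total += (k / n) * log2(k * n_s[s] / (n_cs[(c, s)] * n_os[(o, s)]))
    return total

# Toy case 1: at a single fixed state, the outcome is determined by context.
dependent = [("A", 1, 0)] * 50 + [("B", 0, 0)] * 50
print(conditional_mutual_information(dependent))    # 1.0

# Toy case 2: the outcome is determined by the state alone, so context adds nothing.
independent = [("A", 0, 0), ("B", 0, 0), ("A", 1, 1), ("B", 1, 1)]
print(conditional_mutual_information(independent))  # 0.0
```

Plug-in estimates are biased upward at small sample sizes, which is one reason values should be compared against sampling error.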

Implications, Limitations, and Future Directions

The findings have practical and conceptual implications:

  • Architectural Efficiency: Contextual dependence can be achieved without explicit context input or increased recurrent state, suggesting recurrent architectures with intervention-based modulation could offer a scalable alternative to ever-larger recurrent cores.
  • Interpretability and Modularity: Context-indexed interventions produce interpretable, structured modulation of latent representations, potentially benefiting factorization of roles such as phase-switching, task selection, or tool reuse in larger agentic systems.
  • Theoretical Alignment: Positive $I(C;O \mid S)$ values support recent information-theoretic theorems about contextuality, though the empirical setting does not instantiate all formal assumptions of those results.

Limitations include restriction to a controlled, two-phase benchmark, limited generalization claims to unseen switch times, and a focus on a simple residual linear intervention. Broader evaluation across richer context structures, more complex environments, modulation schemes beyond linearity, and tighter empirical estimators would clarify the generality and limitations of these findings.

Future research should explore:

  • Application of intervention-based modulation mechanisms in large-scale RL/sequence models.
  • Analysis of latent geometry before and after intervention across tasks with hierarchical or compositional context.
  • Direct comparisons with alternative conditional architectures, such as FiLM [Perez et al., AAAI 2018] or hypernetwork-style modulation, in terms of expressiveness and efficiency.

Conclusion

This study demonstrates that context-dependent control in sequential decision tasks can be realized by intervention on a shared recurrent latent state, obviating the need for recurrent memory growth or direct context concatenation. The intervention model matches the best memory-based baseline in benchmark performance and exhibits measurable conditional contextual information, supporting both practical efficacy and alignment with information-theoretic predictions of contextual dependence. These results argue for the utility of intervention-based architectures as a lightweight, interpretable primitive in both compact and scalable agent designs.
