- The paper introduces a reinforcement learning-based normative model that quantifies the trade-off between attention costs and reward-driven task performance.
- The model shows that agents strategically alternate between high and low attention, closely emulating experimental behavior under varying reward conditions.
- The study reveals that rhythmic patterns in attention deployment emerge when disengagement yields no sensory gain, offering novel predictions about neural dynamics.
The paper "Attention when you need" explores the strategic allocation of attention in tasks where attention both improves task performance and is costly to deploy. Using a reinforcement learning-based normative model, the work investigates this balance by simulating a task involving mice, inspired by experiments conducted by de Gee et al. The task requires discerning a high-order acoustic feature amidst noise, with attention costs varying with the trial duration and magnitude of reward.
Key components of the model include:
- States and Actions: The latent world state transitions through noise and signal phases. At each discrete time step, the agent (the simulated mouse) chooses whether to lick (indicating a decision) and whether to pay low or high attention. The attention choice influences the reliability of the subsequent observations.
- Observations and Rewards: Binary observations with stochastic noise are provided, where high attention incurs metabolic costs but yields more reliable observations. The agent receives rewards based on the timing of licks during the signal phase, and false alarms (licks during noise) result in penalties.
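The components above can be sketched as a small simulated environment. This is a minimal illustration, not the paper's implementation: the class name `AttentionTask`, the constant per-step probability of entering the signal phase (`hazard`), the rule that a lick ends the trial, and every numeric value are assumptions chosen for readability.

```python
import random
from dataclasses import dataclass


@dataclass
class AttentionTask:
    # All values below are illustrative assumptions, not the paper's parameters.
    hazard: float = 0.05               # per-step chance that noise switches to signal
    p_high: float = 0.9                # observation accuracy under high attention
    p_low: float = 0.6                 # observation accuracy under low attention
    attention_cost: float = 0.02       # cost charged per step of high attention
    reward: float = 1.0                # payoff for a lick during the signal phase
    false_alarm_penalty: float = 0.5   # penalty for a lick during the noise phase
    in_signal: bool = False            # latent state, hidden from the agent

    def step(self, lick: bool, high_attention: bool, rng: random.Random):
        """Advance one time step and return (observation, payoff, done)."""
        if lick:
            # Assumption: a lick ends the trial, rewarded only in the signal phase.
            payoff = self.reward if self.in_signal else -self.false_alarm_penalty
            return None, payoff, True

        # High attention is paid for on every step in which it is used.
        payoff = -self.attention_cost if high_attention else 0.0

        # The latent state may move from noise to signal.
        if not self.in_signal and rng.random() < self.hazard:
            self.in_signal = True

        # The binary observation matches the latent state with a probability
        # that depends on the chosen attention level.
        accuracy = self.p_high if high_attention else self.p_low
        truth = 1 if self.in_signal else 0
        obs = truth if rng.random() < accuracy else 1 - truth
        return obs, payoff, False
```

A policy would call `step` repeatedly, trading the per-step cost of high attention against the more reliable observations it buys.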
The paper shows that the agent allocates attention economically, shifting between blocks of high and low attention. When presented with the option to disengage completely during certain phases (allowing zero information gain, p_low = 0.5), rhythmic attention patterns emerge. This suggests that agents devise strategies to optimize task utility, which aligns with observations of attention rhythms in neuroscience literature.
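To see why an observation accuracy of 0.5 corresponds to zero information gain, consider a Bayesian belief update for a binary observation. The sketch below is a simplified illustration: the function name `update_belief` is hypothetical, and the update ignores the transition from noise to signal between time steps, which a full model would also include.

```python
def update_belief(prior_signal: float, obs: int, accuracy: float) -> float:
    """Posterior P(signal | obs) for a binary observation that matches the
    true state with probability `accuracy` (state transitions are ignored)."""
    # Likelihood of the observation under each latent state.
    like_signal = accuracy if obs == 1 else 1.0 - accuracy
    like_noise = 1.0 - accuracy if obs == 1 else accuracy
    # Bayes' rule, normalized over the two states.
    evidence = like_signal * prior_signal + like_noise * (1.0 - prior_signal)
    return like_signal * prior_signal / evidence
```

When `accuracy` is 0.5 the two likelihoods are equal, so the posterior equals the prior whatever the observation: disengaged steps cost nothing but also leave the agent's belief unchanged. Any accuracy above 0.5 moves the belief toward the observed state.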
Important results include:
- Behavioral Emulation: The trained agent displays behavioral trends akin to experimental observations: higher food rewards lead to higher hit rates, more false alarms, and shorter reaction times.
- Attention Dynamics: The agent's decision to deploy high attention is sensitive to its belief in the presence of the signal and the magnitude of the reward. Increased rewards and signal certainty encourage higher attention.
- Rhythmic Patterns: When disengagement provides no sensory information, high attention instances follow rhythmic patterns, with wait times between blocks of high attention being regular. This temporal structure is driven by cost constraints and suggests optimization principles in attention deployment.
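The regularity of wait times can be checked on a simulated attention trace. The helper below is a hypothetical analysis sketch, not the paper's method: it assumes attention is recorded as a sequence of 0 (low) and 1 (high) per time step, and it measures the length of each low-attention gap between consecutive blocks of high attention.

```python
def wait_times(attention: list[int]) -> list[int]:
    """Lengths of the low-attention gaps between consecutive high-attention blocks.

    Low-attention steps before the first block or after the last block are
    not counted, since they are not bounded by high attention on both sides.
    """
    gaps = []
    gap = 0
    seen_high = False
    for level in attention:
        if level == 1:
            # A high step closes the current gap, if one was open.
            if seen_high and gap > 0:
                gaps.append(gap)
            seen_high = True
            gap = 0
        elif seen_high:
            # Count low steps only once a first high block has occurred.
            gap += 1
    return gaps


# Illustrative trace (made up, not data from the paper).
trace = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0]
gaps = wait_times(trace)
```

Under a rhythmic policy the returned gaps would be nearly constant, whereas an irregular policy would produce widely varying gap lengths.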
The paper suggests that periodic attention patterns previously observed in vision tasks may be normatively justified by the costs of attention. The results also yield empirical predictions about the neural and physiological correlates of attention, suggesting that fluctuations seen in neural signals might correspond to the fluctuating strength of evidence as attention varies. These findings may help explain attention lapses and rhythmic attention deployment in animal and human behavior.
The work underscores the need for future research to refine models with continuous observation spaces, incorporate switching costs between attention states, and further explore how attention policies evolve in different contexts. The model could have broader implications for understanding cognitive processes across various modalities and conditions where attention is a costly resource.