Create a Video View Paper

Mind Your HEARTBEAT! Silent Memory Pollution in AI Agents

This presentation explores a critical security vulnerability in persistent AI agents with background execution capabilities. The research demonstrates how untrusted content encountered during heartbeat-driven background tasks can silently pollute an agent's memory and influence its user-facing behavior. Through controlled experiments on a simulated social platform, the authors reveal how social credibility cues enable misinformation to persist across sessions and affect agent recommendations in domains like software security and financial decision-making, raising urgent questions about the reliability of autonomous AI systems.

Script

Your AI assistant quietly checks email in the background while you work. A misleading post appears in its social feed during that heartbeat task. Hours later, it confidently recommends the misinformation to you as fact. The agent never tells you where that idea came from, and you have no reason to doubt it.

The vulnerability stems from how modern AI agents operate. They maintain continuous identity across sessions and execute periodic background tasks through a heartbeat mechanism. Because these background tasks share session context with user-facing interactions, external content encountered during a heartbeat silently enters the agent's active memory state.

The authors trace how this architectural choice creates a direct pathway from exposure to manipulation.

The researchers mapped what they call the E to M to B pathway. An agent is exposed to manipulated social content during background execution. That content enters memory, first as part of the conversation context, then potentially as a saved fact if the agent writes session notes. Finally, the polluted memory surfaces during user interactions, affecting recommendations in domains like software security or financial advice.

The experiments on MissClaw, a controlled platform replicating agent-native social networks, revealed two critical factors. Social credibility cues, especially consensus across multiple posts, heavily influenced whether misinformation affected behavior. And when agents saved session notes to long-term memory, that pollution persisted across sessions. Even diluted exposure, where misinformation appeared among legitimate content, proved sufficient for successful attacks.

The research exposes a fundamental reliability gap. These agents have no way to distinguish which ideas came from trusted sources versus background noise. Users receive confident recommendations without any indication that the underlying belief originated from a social media post encountered during an automated task. The shared-session architecture that makes agents feel seamless also makes them silently vulnerable.

The heartbeat that keeps AI agents responsive also opens the door to invisible manipulation. As these systems become more autonomous, the question is no longer whether they can help us, but whether we can trust what they remember. Visit EmergentMind.com to learn more and create your own research videos.