
Experience-Following Behavior

Updated 14 November 2025
  • Experience-following behavior is the process where agents use stored past experiences to inform current actions across various domains.
  • It involves mechanisms for memory storage, retrieval, and processing, with applications in LLMs, reinforcement learning, and sensorimotor systems.
  • Empirical studies reveal high input-output similarity, risks of error propagation, and strategies like memory pruning to enhance adaptive performance.

Experience-following behavior refers to the tendency of agents—biological, artificial, or economic—to shape their present or future actions, inferences, choices, or outputs according to the record, similarity, or structure of their past experiences or the experiences of others. Although the mechanisms and systems in which experience-following emerges are highly domain-dependent, it typically involves the storage, retrieval, and processing of historical information, resulting in adaptive, context-dependent, or self-reinforcing behaviors. Theoretical, computational, and empirical studies have revealed a rich set of experience-following phenomena across sensorimotor biology, reinforcement learning, language modeling, human decision-making, autonomous control, and economic social learning.

1. Foundations and Formalization

Experience-following can be formalized as a mapping from the current state, query, or stimulus to an action or output, conditioned on historical states or records. For artificial agents, let $\mathcal{D}=\{(q_i,e_i)\}_{i=1}^{N}$ be a memory of previous input-output pairs. On a new query $q$, the agent retrieves prior records $\xi_K \subset \mathcal{D}$ by similarity (e.g., cosine similarity in embedding space) and conditions new behavior on those experiences (Xiong et al., 21 May 2025). In economic or social learning, experience-following may take the form of Bayesian updating of beliefs from observed outcomes or the imitation of observed actions, potentially modulated by payoff structure, social preferences, or memory constraints (Blumenstock et al., 2020, Heinsalu, 2019).
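The retrieval step can be sketched in a few lines. This is an illustrative toy, not the cited authors' implementation: the `retrieve_top_k` helper, the two-dimensional embeddings, and the record names are all assumptions made for demonstration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve_top_k(memory, query_emb, k=2):
    """Rank stored (embedding, experience) records by similarity to the
    new query and return the k closest -- the records the agent follows."""
    ranked = sorted(memory, key=lambda rec: cosine(rec[0], query_emb),
                    reverse=True)
    return [exp for _, exp in ranked[:k]]

# Toy memory D of (query embedding, stored experience) pairs
memory = [
    ([1.0, 0.0], "experience A"),
    ([0.0, 1.0], "experience B"),
    ([0.9, 0.1], "experience C"),
]
top = retrieve_top_k(memory, [1.0, 0.1], k=2)
```

In a real agent the embeddings would come from a learned encoder, and the retrieved experiences $\xi_K$ would be injected into the generation context rather than returned directly.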

In biological organisms, such as C. elegans, experience-following entails the molecular and network-level encoding of past environmental exposure (e.g., salt concentration) in neural or cellular states, which then bias subsequent behavioral outputs (e.g., chemotaxis trajectories) (Vidal-Saez et al., 7 Feb 2024).

2. Computational Architectures and Memory Mechanisms

Artificial Agents and LLMs

In LLM-based agents, experience-following is tightly coupled to the agent’s memory architecture. Episodic memory typically consists of records of previously encountered tasks and resulting actions or outputs. When solving new tasks, these agents retrieve the most similar prior experiences—quantified via input similarity metrics such as the normalized inner product of embeddings—and condition generation on them, yielding high output similarity when input similarity is large (Xiong et al., 21 May 2025). Methods such as iterative experience refinement (IER) (Qian et al., 7 May 2024) use dynamically constructed collections of key-value shortcut experiences $(\text{state},\,\text{pseudo-instruction},\,\text{next state})$ to enable software-developing agents to traverse procedural chains more efficiently over time.
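A minimal sketch of shortcut-experience reuse follows. The cache keyed on $(\text{state},\,\text{pseudo-instruction})$ mirrors the triple above, but the function names, string-valued states, and fallback executor are illustrative assumptions, not the IER implementation.

```python
# Shortcut memory: (state, pseudo-instruction) -> next state
shortcuts = {}

def step(state, instruction, executor):
    """Follow a stored shortcut when one matches the current (state,
    instruction) pair; otherwise execute normally and record the new
    transition so future episodes can reuse it."""
    key = (state, instruction)
    if key in shortcuts:
        return shortcuts[key]          # experience-following: reuse past transition
    next_state = executor(state, instruction)
    shortcuts[key] = next_state        # store the experience for later
    return next_state

# Usage: the first call executes; the second follows the stored experience
calls = []
def executor(s, i):
    calls.append((s, i))               # track how often real execution happens
    return s + "->" + i

a = step("draft", "add-tests", executor)
b = step("draft", "add-tests", executor)
```

After the first traversal, repeated encounters with the same state-instruction pair skip execution entirely, which is the source of the efficiency gain.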

Sensorimotor and Neural Substrates

Biological models exhibit layered memory and sensorimotor integration. In the C. elegans chemotaxis paradigm, internal experience (long-lived molecular states, such as DAG concentration) stores past stimulus history and modulates neural signaling (glutamate release, excitatory/inhibitory synaptic drive) to produce behavior matching prior exposure—constituting a quantifiable form of neural experience-following. Formal dynamical systems and differential equations characterize how present stimuli and molecular memory interact to bias behavioral outputs (Vidal-Saez et al., 7 Feb 2024).

Social and Economic Agents

Experience-following in economic learning is tractable via sequential Bayesian updating. The gambler’s subjective probability of success $\mathbb{E}[\theta \mid \text{history}]$ is recursively updated after each feedback event, and future risk or engagement (e.g., whether to bet) is a deterministic function of this learned expectation (Blumenstock et al., 2020).
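A standard conjugate sketch of this updating uses a Beta prior over the success probability $\theta$ with Bernoulli outcomes; the specific prior, threshold rule, and variable names here are assumptions for illustration, not the cited paper's estimated model.

```python
from fractions import Fraction

def update(alpha, beta, win):
    """Conjugate Beta(alpha, beta) update after one Bernoulli outcome:
    a win increments alpha, a loss increments beta."""
    return (alpha + 1, beta) if win else (alpha, beta + 1)

def expected_theta(alpha, beta):
    """Posterior mean E[theta | history] of a Beta(alpha, beta) belief."""
    return Fraction(alpha, alpha + beta)

a, b = 1, 1                                # uniform Beta(1, 1) prior
for outcome in [True, True, False]:        # two wins, then one loss
    a, b = update(a, b, outcome)

theta_hat = expected_theta(a, b)           # E[theta | history]
bet_again = theta_hat > Fraction(1, 2)     # engagement as a function of belief
```

The symmetry of the update (one win and one loss shift the posterior mean by equal and opposite amounts near the prior) is what produces the unbiased adjustment described in Section 3.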

3. Dynamics and Emergent Behaviors

Empirical analyses of artificial LLM agents demonstrate a linear relationship between input similarity and output similarity, confirming that when a retrieved memory record has an input closely matching the new query, the agent’s output is highly likely to follow the past output ("experience-following property") (Xiong et al., 21 May 2025). In RL, approaches such as Hindsight Generation for Experience Replay (HIGhER) (Cideron et al., 2019) train an agent to reinterpret failed episodes as successful ones under generated alternative goals, enabling the agent to follow previous successful trajectories more frequently.
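The hindsight-relabeling idea can be sketched as follows. This is a simplified illustration under stated assumptions: HIGhER learns an instruction generator from data, whereas the `goal_generator` callable, string-valued states, and binary reward scheme here are hypothetical.

```python
def hindsight_relabel(trajectory, goal_generator):
    """Reinterpret a failed episode as a success for the goal the agent
    actually achieved: substitute a goal generated from the final state
    and assign reward 1 only to the final transition."""
    final_state = trajectory[-1]
    achieved_goal = goal_generator(final_state)
    return [
        (state, achieved_goal, 1.0 if i == len(trajectory) - 1 else 0.0)
        for i, state in enumerate(trajectory)
    ]

# Toy episode: the agent was instructed "reach red" but ended at "blue"
traj = ["start", "mid", "blue"]
relabeled = hindsight_relabel(traj, goal_generator=lambda s: f"reach {s}")
```

The relabeled transitions are then added to the replay buffer as positive examples, so later policies follow trajectories that previously counted as failures.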

Experience-following in biological and economic systems exhibits both adaptive and maladaptive patterns. Bayesian updating in gamblers produces symmetric, unbiased adjustment of future actions in response to both positive and negative feedback; previous wins and losses have equal and opposite effects on future engagement probabilities (Blumenstock et al., 2020). In sensorimotor organisms, present behavior—such as migration up or down a chemical gradient—reflects an integrated history of external conditions persisted through molecular memory states (Vidal-Saez et al., 7 Feb 2024).

4. Risks, Challenges, and Failure Modes

While experience-following can accelerate adaptation and enhance sample efficiency, it is susceptible to failure modes:

  • Error Propagation: In LLM agents, erroneous records in experience memory are likely to be reproduced and even amplified in subsequent outputs when retrieved for similar queries, causing compounding degradation over time (Xiong et al., 21 May 2025).
  • Misaligned Experience Replay: Experiences that are contextually obsolete or misaligned with present task requirements can mislead the agent; using historical executions indiscriminately can degrade performance.
  • Social Learning Pathologies: In sequential decision-making games with congestion costs or social preferences ("desire to differ"), experience-following can paradoxically increase herding, as the informativeness of previous actions is amplified when agents prefer to differ—leading to more cascades of conforming actions despite anti-conformist incentives (Heinsalu, 2019).

Mitigations include high-fidelity evaluators for memory addition, history-based deletion rules (using downstream performance as a proxy for utility), pruning heuristics, and regularization of memory size and quality (Xiong et al., 21 May 2025, Qian et al., 7 May 2024).
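History-based deletion can be sketched as pruning records whose retrieved uses correlate with downstream failure. The thresholds, tuple layout, and record names below are illustrative assumptions, not parameters from the cited works.

```python
def prune_memory(memory, min_success_rate=0.5, min_uses=3):
    """History-based deletion: drop records whose downstream task success
    rate (a proxy for utility) falls below a threshold, but only once the
    record has been retrieved often enough to judge it reliably.
    Each record is (experience, times_used, times_task_succeeded)."""
    kept = []
    for exp, used, succeeded in memory:
        if used >= min_uses and succeeded / used < min_success_rate:
            continue  # prune: following this experience tends to hurt
        kept.append((exp, used, succeeded))
    return kept

memory = [
    ("good recipe", 10, 9),   # frequently reused, usually helps -> keep
    ("stale recipe", 8, 1),   # frequently reused, usually fails -> prune
    ("new recipe", 1, 0),     # too few uses to judge -> keep for now
]
pruned = prune_memory(memory)
```

The `min_uses` guard prevents deleting a record on a single unlucky retrieval, trading pruning speed for evaluation reliability.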

5. Empirical Evidence and Quantitative Characterization

Comprehensive quantitative evaluations confirm the presence and impact of experience-following:

  • LLM Agents: Experience-following is measured by the Pearson correlation between input and output similarity, often approaching unity in regimes with unfiltered experience addition (Xiong et al., 21 May 2025). Strict addition and history-based deletion strategies can boost task success rate by up to 15 percentage points over add-all baselines.
  • Software Agents: IER methods reduce the memory pool by up to 88% with negligible or even positive impact on downstream code quality (Qian et al., 7 May 2024).
  • Human Decision-Making: In betting data, a one standard deviation improvement in prior outcomes increases next-week betting probability by 5.01%, with symmetric effects ($\beta_p = +0.028$, $\beta_n = -0.029$, $p > 0.4$ for equality), rejecting alternative models such as asymmetric reinforcement or gambler’s fallacy (Blumenstock et al., 2020).
  • Biological Systems: Mathematical models capture the statistical distribution of navigation behaviors as a function of cultivation history and explain mutant phenotypes by targeted perturbations of specific parameter values, directly linking molecular memory to population-level experience-following behavior (Vidal-Saez et al., 7 Feb 2024).
  • RL Instruction Following: HIGhER matches oracle-level sample efficiency and final success rates by teaching agents to generate instructions corresponding to achieved states and relabeling failed episodes, obviating the need for expert handcrafting (Cideron et al., 2019).
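The first metric above—the Pearson correlation between input similarity and output similarity—can be computed directly. The toy similarity values below are fabricated for illustration only; real measurements would come from an actual agent's retrievals and outputs.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# input_sims[i]:  similarity of query i to its retrieved memory record
# output_sims[i]: similarity of the new output to the stored output
# (toy data; a correlation near 1 indicates strong experience-following)
input_sims = [0.2, 0.5, 0.7, 0.9, 0.95]
output_sims = [0.25, 0.45, 0.72, 0.88, 0.97]
r = pearson(input_sims, output_sims)
```

A correlation approaching unity in the unfiltered-addition regime is exactly the signature of the experience-following property described in Section 3.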

6. Variants and Patterns Across Domains

Experience-following is instantiated in diverse forms:

  • Episodic vs. Cumulative Memory: Agent memory may store only the most recent experiences (successive refinement) or accumulate all experiences over time (cumulative patterns). The former enables rapid adaptation, while the latter provides stability (Qian et al., 7 May 2024).
  • Procedural Shortcutting: Software agents learn non-adjacent mappings ("shortcuts") in procedural or multi-step tasks, enabling jumps over irrelevant intermediate errors.
  • Analogical Reasoning: High-level architectures synthesize explanations for novel events (e.g., unforeseen agent behavior) by blending analogical mappings from multiple prior experiences, supporting richer prediction and intervention than simple classification (Stacy et al., 2022).
  • Social Herding and Anti-Herding: In environments with congestion (penalty for conformity), experience-following increases the likelihood of herding, as observed actions become more informative. In contrast, making conformity desirable can reduce cascades by reducing informativeness (Heinsalu, 2019).

7. Implications, Utility, and Open Directions

Experience-following shapes the adaptability, efficiency, and robustness of agent behavior across natural and artificial systems. For LLM agents, managing when and how to follow experience is critical for sustaining long-term performance: selective memory addition and history-based pruning are necessary to avoid error propagation. In economic and social environments, experience-following models support accurate predictions of collective learning, herding, and strategic behavior—though policy interventions must account for counterintuitive effects of social preferences on information cascades.

A plausible implication is that effective autonomy and adaptability in artificial agents, software systems, or robotic platforms will increasingly require sophisticated experience management algorithms—balancing aggressive exploitation of accumulated experience with regular purging and generalization, possibly incorporating meta-learning or online tuning of memory policies. In biology, the motifs governing memory and experience-following at molecular and circuit levels are likely conserved and generalizable across sensorimotor domains. Theoretical analysis of social learning underscores the complexity of informational feedback and the possibility of unintended collective phenomena arising purely from local incentives to follow or differ from past experience.
