
SP-Random Walk: Bayesian Foraging Model

Updated 24 November 2025
  • SP-Random Walk is a model for individual-based learning in foraging systems using MCMC sampling to update behavioral parameters.
  • It employs Bayesian posterior updates via a Metropolis–Hastings algorithm to balance exploration and exploitation in adapting to dynamic resource landscapes.
  • The framework shows that optimal foraging performance is achieved at intermediate canalization levels, maximizing energy intake while adapting to rapid environmental changes.

SP-Random Walk (Self-Plastic Random Walk) refers to individual-based learning and adaptation in foraging systems, where an agent's behavioral strategy is iteratively updated via Markov Chain Monte Carlo (MCMC) sampling guided by energetic feedback from environmental interactions. The term, introduced in "Simulating how animals learn: a new modelling framework applied to the process of optimal foraging" by Thompson et al., designates a Monte Carlo–driven stochastic search on the space of behavioral parameters, embedding both learning and random walk–style spatial decisions within a single Bayesian framework (Thompson et al., 2022). This framework yields a mathematically explicit, statistically principled account of how animals can optimize foraging in dynamic, uncertain landscapes through simulated sampling of alternative behaviors.

1. Mathematical Framework and Parameterization

Thompson et al. model an animal's foraging strategy as a parameter vector $\theta = (\beta, \gamma, q, h)$ (see the container sketch after this list), where:

  • $\beta \ge 0$: memory decay rate (how quickly memories of past patch qualities fade).
  • $\gamma \ge 0$: spatial movement bias (relative preference for nearby over distant sites).
  • $q \in [0,1]$: naivety, the default expected value of unvisited patches.
  • $h \in [0,1]$: preference weight for resource type $Q_1$ versus $Q_2$.
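
For concreteness, a minimal sketch of the strategy vector as a data container with its uniform prior box; the upper bounds on $\beta$ and $\gamma$ are illustrative assumptions, not values taken from the paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Strategy:
    """Behavioral parameter vector theta = (beta, gamma, q, h)."""
    beta: float    # memory decay rate, beta >= 0
    gamma: float   # spatial movement bias, gamma >= 0
    q: float       # default expectation for unvisited patches, in [0, 1]
    h: float       # preference weight for resource Q1 versus Q2, in [0, 1]

    # Assumed uniform-prior box; the upper bounds on beta and gamma are
    # illustrative, not values from the paper.
    LOWER = np.array([0.0, 0.0, 0.0, 0.0])
    UPPER = np.array([5.0, 5.0, 1.0, 1.0])

    def as_array(self) -> np.ndarray:
        return np.array([self.beta, self.gamma, self.q, self.h])

    def in_prior(self) -> bool:
        theta = self.as_array()
        return bool(np.all(theta >= self.LOWER) and np.all(theta <= self.UPPER))
```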

The model assumes a uniform prior $p(\theta)$ over all admissible combinations of these behavioral parameters. For any given $\theta$, simulation produces $N_{\text{avg}}$ independent foraging trajectories, each with $T_{\text{train}}$ learning steps and $T_{\text{test}}$ test steps. The realized net energetic intake for trajectory $i$ is

$$f_i(\theta \mid Q) = \frac{1}{T_{\text{test}}} \left[ \sum_{t = T_{\text{train}}+1}^{T_{\text{train}} + T_{\text{test}}} Q(x_t, t) \;-\; v \sum_{t = T_{\text{train}}+1}^{T_{\text{train}} + T_{\text{test}}} d(x_t, x_{t-1}) \right]$$

where $Q(x_t, t)$ is the patch resource, $v$ is the per-unit energetic travel cost, and $d(x_t, x_{t-1})$ is the distance travelled between successive positions. Averaging over samples yields the pseudo-likelihood $f(\theta \mid Q) = \frac{1}{N_{\text{avg}}} \sum_i f_i(\theta \mid Q)$, which drives the Bayesian update.
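
A minimal sketch of this pseudo-likelihood computation, assuming a hypothetical `simulate_trajectory` function (not part of the published model code) that returns the visited positions and per-step resource gains of one track:

```python
import numpy as np

def pseudo_likelihood(theta, landscape, simulate_trajectory,
                      n_avg=20, t_train=500, t_test=500, travel_cost=0.01):
    """Average net energetic intake f(theta | Q) over n_avg simulated tracks.

    `simulate_trajectory` is a hypothetical stand-in for the individual-based
    forager simulation; it is assumed to return the sequence of visited
    positions (length t_train + t_test + 1) and the resource gained at each
    step (length t_train + t_test).
    """
    intakes = []
    for _ in range(n_avg):
        positions, gains = simulate_trajectory(theta, landscape, t_train + t_test)
        positions = np.asarray(positions)
        gains = np.asarray(gains)
        # Only the test phase contributes to the realized intake f_i.
        test_gain = gains[t_train:].sum()
        steps = np.diff(positions[t_train:], axis=0)          # x_t - x_{t-1}
        test_distance = np.linalg.norm(steps, axis=1).sum()
        intakes.append((test_gain - travel_cost * test_distance) / t_test)
    return float(np.mean(intakes))
```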

The joint posterior is constructed as

$$p(\theta \mid \text{data}) \propto p(\theta) \left[f(\theta \mid Q)\right]^k$$

with $k > 0$ a canalization index. Small $k$ yields highly exploratory, plastic learners (flat, diffuse posteriors); large $k$ yields canalized, nearly deterministic exploitation.

Sampling from this posterior employs a Metropolis–Hastings random walk in parameter space: proposals $\theta'$ are generated (e.g., Gaussian jumps) and accepted with probability

$$\alpha(\theta^{(n)} \to \theta') = \min \left\{ 1, \frac{p(\theta')\, [f(\theta' \mid Q)]^{k}\, q(\theta^{(n)} \mid \theta')}{p(\theta^{(n)})\, [f(\theta^{(n)} \mid Q)]^{k}\, q(\theta' \mid \theta^{(n)})} \right\}$$

Symmetric proposals reduce this to the ratio of pseudo-likelihoods (weighted intakes) and priors.
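
A sketch of a single Metropolis–Hastings update under a symmetric Gaussian proposal, in which the kernel terms cancel; the prior box and the clipping of negative intakes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed uniform-prior box for theta = (beta, gamma, q, h); the upper bounds
# on beta and gamma are illustrative, not values from the paper.
LOWER = np.array([0.0, 0.0, 0.0, 0.0])
UPPER = np.array([5.0, 5.0, 1.0, 1.0])

def mh_step(theta, f_theta, pseudo_likelihood, k=2.0, step_sd=0.05):
    """One Metropolis-Hastings update of the behavioral parameter vector."""
    proposal = theta + rng.normal(0.0, step_sd, size=theta.shape)
    # Uniform prior: zero density outside the box, so reject immediately.
    if np.any(proposal < LOWER) or np.any(proposal > UPPER):
        return theta, f_theta
    f_prop = pseudo_likelihood(proposal)
    # Symmetric Gaussian kernel cancels, leaving the ratio of canalization-
    # weighted intakes; negative intakes are clipped to zero for illustration.
    ratio = max(f_prop, 0.0) / f_theta if f_theta > 0 else np.inf
    if rng.random() < min(1.0, ratio ** k):
        return proposal, f_prop
    return theta, f_theta
```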

2. Foraging Simulation and Cognitive Mapping

Agents experience a spatial landscape (a continuous 100 × 100 torus) with temporally evolving resource fields $Q_1(x, t)$ and $Q_2(x, t)$. Each agent maintains a cognitive map $C(x, t)$ based on the following components (see the sketch after this list):

  • Instantaneous perception: $p(x, y) = \exp(-d(x, y)/\rho)$.
  • Exponentially decaying memory: $m(\tau) = \exp(-\beta\tau)$.
  • Default expectation: $q$ for never-visited patches.
  • Preference weighting: $\tilde Q(x, t) = h\, Q_1(x, t) + (1-h)\, Q_2(x, t)$.
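
The way these terms combine into $C(x, t)$ is specified in the original paper; the sketch below is one plausible combination (perception-weighted blend of memory recall and the naive default $q$), included for illustration only:

```python
import numpy as np

def preference_weighted(q1, q2, h):
    """Preference weighting: Q~(x, t) = h * Q1(x, t) + (1 - h) * Q2(x, t)."""
    return h * q1 + (1.0 - h) * q2

def cognitive_map_value(dist_to_site, t, last_visit, remembered, theta, rho=5.0):
    """Illustrative value C(x, t) for a site at distance `dist_to_site`.

    `remembered` is the preference-weighted quality recorded at the last visit
    (None if never visited); how perception, memory decay, and the naive
    default q are blended here is an assumption for illustration.
    """
    beta, _gamma, q_default, _h = theta
    perception = np.exp(-dist_to_site / rho)           # p(x, y) = exp(-d / rho)
    if remembered is None:
        expectation = q_default                        # never-visited patch
    else:
        recall = np.exp(-beta * (t - last_visit))      # m(tau) = exp(-beta * tau)
        expectation = recall * remembered + (1.0 - recall) * q_default
    return perception * expectation
```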

At each decision point, candidate next sites are drawn according to a step-length distribution governed by $\gamma$ (e.g., exponential or gamma), with turning angles drawn from a von Mises distribution. Site selection is then stochastic, with probability proportional to $C(x, t)^\lambda$; large $\lambda$ strongly biases the move toward high-perceived-quality patches. Movement toward a selected point of interest follows a correlated random walk, but opportunistic patch switches are allowed if intervening sites exceed the expected quality.
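
A sketch of one decision step, assuming an exponential step-length kernel with mean $1/\gamma$, a fixed von Mises concentration, and a hypothetical `map_value` hook into the cognitive map:

```python
import numpy as np

rng = np.random.default_rng(1)

def propose_and_select(position, heading, gamma, lam, map_value,
                       n_candidates=20, kappa=2.0, world=100.0):
    """Draw candidate sites and select one with probability proportional to C^lambda.

    `map_value` is a hypothetical hook returning the cognitive-map value
    C(x, t) of a candidate site; the exponential step-length kernel with mean
    1/gamma and the von Mises concentration kappa are illustrative choices.
    """
    steps = rng.exponential(1.0 / gamma, size=n_candidates)    # step lengths
    turns = rng.vonmises(0.0, kappa, size=n_candidates)        # turning angles
    angles = heading + turns
    offsets = np.stack([steps * np.cos(angles), steps * np.sin(angles)], axis=1)
    candidates = (position + offsets) % world                  # wrap on the torus
    weights = np.array([map_value(c) ** lam for c in candidates])
    probs = weights / weights.sum()
    idx = rng.choice(n_candidates, p=probs)
    return candidates[idx], angles[idx]
```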

Resource depletion and regrowth are local processes: visiting a site reduces $Q(x, t)$ by $d_L$, and the resource recovers at rate $r_L$ per step.
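
A small sketch of this local dynamic, assuming depletion is floored at zero and regrowth is additive up to a site-specific capacity (both illustrative assumptions):

```python
import numpy as np

def update_resource(Q, visited_idx, capacity, d_L=0.5, r_L=0.05):
    """Deplete the visited cell by d_L, then let every cell regrow at rate r_L.

    Flooring at zero and additive regrowth capped at `capacity` are
    illustrative assumptions about the local dynamics.
    """
    Q = Q.copy()
    Q[visited_idx] = max(0.0, Q[visited_idx] - d_L)   # local depletion
    Q = np.minimum(capacity, Q + r_L)                 # regrowth toward capacity
    return Q
```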

3. Learning Dynamics, Canalization, and Plasticity

The Metropolis–Hastings SP-Random Walk iterates as follows:

  1. Propose a new behavioral parameter vector $\theta'$ via a random perturbation of the current $\theta$.
  2. Simulate foraging under $\theta'$ and compute $f(\theta' \mid Q)$ over $N_{\text{avg}}$ tracks.
  3. Accept or reject $\theta'$ using the acceptance probability given above.
  4. Repeat for $N_{\text{iter}}$ iterations, discarding an initial burn-in period of $N_{\text{burn}}$ iterations (see the loop sketch below).
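
Putting these steps together, a compact sketch of the full chain, assuming a `pseudo_likelihood` callable as sketched in Section 1; the prior box, step size, and treatment of non-positive intakes are illustrative choices:

```python
import numpy as np

def sp_random_walk(pseudo_likelihood, theta0, k=2.0, n_iter=2000, n_burn=500,
                   step_sd=0.05, lower=(0, 0, 0, 0), upper=(5, 5, 1, 1), seed=0):
    """Metropolis-Hastings chain over theta = (beta, gamma, q, h).

    `pseudo_likelihood` simulates foraging under a parameter vector and
    returns the averaged net intake f(theta | Q); bounds and step size are
    illustrative, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    theta = np.asarray(theta0, float)
    f_cur = pseudo_likelihood(theta)
    samples = []
    for it in range(n_iter):
        prop = theta + rng.normal(0.0, step_sd, size=theta.shape)
        if np.all(prop >= lower) and np.all(prop <= upper):    # uniform prior support
            f_prop = pseudo_likelihood(prop)
            # Symmetric proposal: acceptance ratio [f(theta') / f(theta)]^k,
            # with non-positive intakes clipped for illustration.
            ratio = max(f_prop, 0.0) / f_cur if f_cur > 0 else np.inf
            if rng.random() < min(1.0, ratio ** k):
                theta, f_cur = prop, f_prop
        if it >= n_burn:
            samples.append(theta.copy())
    return np.array(samples)
```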

The exponent $k$ modulates exploitation/exploration: low $k$ (plasticity) admits broad, multi-modal distributions and fast adaptation after environmental shifts, but high within-chain variance; high $k$ (canalization) locks onto peaks in $f(\theta \mid Q)$, maximizing near-optimal foraging when conditions are static but hindering adaptation to change. Performance is maximized at intermediate values: excessive plasticity leads to intermittent poor bouts, while rigidity locks agents into suboptimal strategies after sudden environmental reconfiguration (Thompson et al., 2022).

4. Computational Details and Convergence

  • Each SP-Random Walk chain is run for $N_{\text{iter}} = 2000$ steps, with $N_{\text{burn}} = 500$ discarded as burn-in.
  • Convergence is assessed using the earth-mover's distance between marginal posteriors from independent chains under the same scenario.
  • In all tested settings, $N_{\text{iter}} = 2000$ suffices for stationary posteriors.
  • Parameter updates are performed on transformed scales to ensure positivity and constraint compliance; proposals use symmetric kernels (e.g., Gaussian in logit or log space). A sketch of the transform and convergence check follows.
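
A sketch of these two details, using `scipy.stats.wasserstein_distance` for the 1-D earth-mover's distance and log/logit transforms for the constrained parameters; the tolerance and transform choices are illustrative assumptions:

```python
import numpy as np
from scipy.special import expit, logit
from scipy.stats import wasserstein_distance

def to_unconstrained(theta):
    """Log-transform the positive rates (beta, gamma) and logit-transform the
    [0, 1] parameters (q, h); an assumed but conventional choice of transforms."""
    beta, gamma, q, h = theta
    return np.array([np.log(beta), np.log(gamma), logit(q), logit(h)])

def to_constrained(z):
    return np.array([np.exp(z[0]), np.exp(z[1]), expit(z[2]), expit(z[3])])

def chains_converged(chain_a, chain_b, tol=0.05):
    """Compare marginal posteriors of two independent chains, parameter by
    parameter, via the 1-D earth-mover's (Wasserstein) distance."""
    dists = np.array([wasserstein_distance(chain_a[:, j], chain_b[:, j])
                      for j in range(chain_a.shape[1])])
    return dists, bool(np.all(dists < tol))
```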

5. Key Results, Insights, and Biological Interpretation

Under static conditions, SP-Random Walk recovers classic predictions of foraging theory:

  • High canalization ($k \gtrsim 1$): unimodal posteriors, deterministic choice of high-$f(\theta \mid Q)$ parameters, rapid convergence.
  • Low canalization (plasticity): broad $\theta$-distributions, greater behavioral variability, higher rates of suboptimal bouts, but superior adaptation to abrupt environmental transitions (resource redistribution or swapping).
  • Across all environmental settings, agents prefer highly concentrated resources even if less abundant, aligning with ideal free distribution theory: high quality outweighs raw abundance (Thompson et al., 2022).

When landscape statistics change abruptly:

  • Canalized agents persist in outdated strategies; plastic agents adapt by broadening their behavioral sampling.
  • Mean net energetic intake is maximized at intermediate $k$, confirming a theoretical trade-off between robustness and opportunism.

The learning architecture requires very few biological assumptions: no explicit memory outside parameter updating, no direct encoding of strategies, and no explicit cost-of-movement or cognitive load (other than via energetic returns). It is thus extensible to multiple ecological and cognitive tasks.

6. Connections, Extensions, and Limitations

Assumptions:

  • Perception decays strictly exponentially with distance.
  • Memory decay is purely exponential.
  • Unvisited patches all share the same default quality qq.
  • The parameters $\lambda$, $\gamma$, and $\beta$ are held fixed within each chain.

Omitted dynamics:

  • No explicit inter-agent competition, predation, social learning, or genetic adaptation.

Extensions:

  • The framework can be immediately extended to any learning problem expressible as "simulator + pseudo-likelihood": e.g., collective cognition, decision-making under risk, economic choices (Thompson et al., 2022).

The SP-Random Walk MCMC protocol provides a unified bridge between Bayesian statistical inference, stochastic simulations of animal learning, and landscape-level foraging optimization, with extensive parameterization possible for environmental heterogeneity, sensory noise, and memory implementations.

7. Table: Core Components of the SP-Random Walk Model

| Component | Description | Mathematical Implementation |
|---|---|---|
| Strategy vector | Behavioral parameters $(\beta, \gamma, q, h)$ | Uniform prior $p(\theta)$ on a defined hypercube |
| Intake function | Energetic net return for behavior $\theta$ | $f(\theta \mid Q)$ from individual-based model (IBM) simulation over $T$ steps |
| Posterior update | Combines prior and "pseudo-likelihood" | $p(\theta \mid \text{data}) \propto p(\theta)\,[f(\theta \mid Q)]^k$ |
| Parameter proposal | Random walk in parameter space | Symmetric proposal kernel (Gaussian in log/logit space) |
| Acceptance rule | Metropolis–Hastings acceptance probability | Ratio of weighted $f(\theta \mid Q)$ and priors |

The SP-Random Walk thus formalizes behavioral adaptation and environmental learning as an explicit, Bayesian, energetically guided exploration of behavioral-parameter space through repeated, simulator-driven random walks in $\theta$, with foraging performance acting as a pseudo-likelihood (Thompson et al., 2022).

References

  • Thompson et al. (2022). "Simulating how animals learn: a new modelling framework applied to the process of optimal foraging."