
SP-Random Walk: Bayesian Foraging Model

Updated 24 November 2025
  • SP-Random Walk is a model for individual-based learning in foraging systems using MCMC sampling to update behavioral parameters.
  • It employs Bayesian posterior updates via a Metropolis–Hastings algorithm to balance exploration and exploitation in adapting to dynamic resource landscapes.
  • The framework shows that optimal foraging performance is achieved at intermediate canalization levels, maximizing energy intake while adapting to rapid environmental changes.

SP-Random Walk (Self-Plastic Random Walk) refers to individual-based learning and adaptation in foraging systems, where an agent's behavioral strategy is iteratively updated via Markov Chain Monte Carlo (MCMC) sampling guided by energetic feedback from environmental interactions. The term, introduced in "Simulating how animals learn: a new modelling framework applied to the process of optimal foraging" by Thompson et al., designates a Monte Carlo–driven stochastic search on the space of behavioral parameters, embedding both learning and random walk–style spatial decisions within a single Bayesian framework (Thompson et al., 2022). This framework yields a mathematically explicit, statistically principled account of how animals can optimize foraging in dynamic, uncertain landscapes through simulated sampling of alternative behaviors.

1. Mathematical Framework and Parameterization

Thompson et al. model an animal's foraging strategy as a parameter vector $\theta = (\beta, \gamma, q, h)$ (see the container sketch after this list), where:

  • $\beta \ge 0$: memory decay rate (how quickly memories of past patch qualities fade).
  • $\gamma \ge 0$: spatial movement bias (relative preference for nearby over distant sites).
  • $q \in [0,1]$: naivety, the default expected value of unvisited patches.
  • $h \in [0,1]$: preference weight for resource type $Q_1$ versus $Q_2$.
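
For concreteness, a minimal sketch of the strategy vector as a data container with its uniform prior box; the upper bounds on $\beta$ and $\gamma$ are illustrative assumptions, not values taken from the paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Strategy:
    """Behavioral parameter vector theta = (beta, gamma, q, h)."""
    beta: float    # memory decay rate, beta >= 0
    gamma: float   # spatial movement bias, gamma >= 0
    q: float       # default expectation for unvisited patches, in [0, 1]
    h: float       # preference weight for resource Q1 versus Q2, in [0, 1]

    # Assumed uniform-prior box; the upper bounds on beta and gamma are
    # illustrative, not values from the paper.
    LOWER = np.array([0.0, 0.0, 0.0, 0.0])
    UPPER = np.array([5.0, 5.0, 1.0, 1.0])

    def as_array(self) -> np.ndarray:
        return np.array([self.beta, self.gamma, self.q, self.h])

    def in_prior(self) -> bool:
        theta = self.as_array()
        return bool(np.all(theta >= self.LOWER) and np.all(theta <= self.UPPER))
```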

The model assumes a uniform prior $p(\theta)$ over all admissible combinations of these behavioral parameters. For any given $\theta$, simulation produces $N_{\text{avg}}$ independent foraging trajectories, each with $T_{\text{train}}$ learning steps and $T_{\text{test}}$ test steps. The realized net energetic intake for trajectory $i$ is

$$f_i(\theta \mid Q) = \frac{1}{T_{\text{test}}} \left[ \sum_{t = T_{\text{train}}+1}^{T_{\text{train}} + T_{\text{test}}} Q(x_t, t) \;-\; v \sum_{t = T_{\text{train}}+1}^{T_{\text{train}} + T_{\text{test}}} d(x_t, x_{t-1}) \right]$$

where $Q(x_t, t)$ is the patch resource, $v$ is the per-unit energetic travel cost, and $d(x_t, x_{t-1})$ is the distance travelled between successive positions. Averaging over samples yields the pseudo-likelihood $f(\theta \mid Q) = \frac{1}{N_{\text{avg}}} \sum_i f_i(\theta \mid Q)$, which drives the Bayesian update.
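
A minimal sketch of this pseudo-likelihood computation, assuming a hypothetical `simulate_trajectory` function (not part of the published model code) that returns the visited positions and per-step resource gains of one track:

```python
import numpy as np

def pseudo_likelihood(theta, landscape, simulate_trajectory,
                      n_avg=20, t_train=500, t_test=500, travel_cost=0.01):
    """Average net energetic intake f(theta | Q) over n_avg simulated tracks.

    `simulate_trajectory` is a hypothetical stand-in for the individual-based
    forager simulation; it is assumed to return the sequence of visited
    positions (length t_train + t_test + 1) and the resource gained at each
    step (length t_train + t_test).
    """
    intakes = []
    for _ in range(n_avg):
        positions, gains = simulate_trajectory(theta, landscape, t_train + t_test)
        positions = np.asarray(positions)
        gains = np.asarray(gains)
        # Only the test phase contributes to the realized intake f_i.
        test_gain = gains[t_train:].sum()
        steps = np.diff(positions[t_train:], axis=0)          # x_t - x_{t-1}
        test_distance = np.linalg.norm(steps, axis=1).sum()
        intakes.append((test_gain - travel_cost * test_distance) / t_test)
    return float(np.mean(intakes))
```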

The joint posterior is constructed as

$$p(\theta \mid \text{data}) \propto p(\theta) \left[f(\theta \mid Q)\right]^k$$

with $k > 0$ a canalization index. Small $k$ yields highly exploratory, plastic learners (flat, diffuse posteriors); large $k$ yields canalized, nearly deterministic exploitation.

Sampling from this posterior employs a Metropolis–Hastings random walk in parameter space: proposals $\theta'$ are generated (e.g., Gaussian jumps) and accepted with probability

$$\alpha(\theta^{(n)} \to \theta') = \min \left\{ 1, \frac{p(\theta')\, [f(\theta' \mid Q)]^{k}\, q(\theta^{(n)} \mid \theta')}{p(\theta^{(n)})\, [f(\theta^{(n)} \mid Q)]^{k}\, q(\theta' \mid \theta^{(n)})} \right\}$$

Symmetric proposals reduce this to the ratio of pseudo-likelihoods (weighted intakes) and priors.
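
A sketch of a single Metropolis–Hastings update under a symmetric Gaussian proposal, in which the kernel terms cancel; the prior box and the clipping of negative intakes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed uniform-prior box for theta = (beta, gamma, q, h); the upper bounds
# on beta and gamma are illustrative, not values from the paper.
LOWER = np.array([0.0, 0.0, 0.0, 0.0])
UPPER = np.array([5.0, 5.0, 1.0, 1.0])

def mh_step(theta, f_theta, pseudo_likelihood, k=2.0, step_sd=0.05):
    """One Metropolis-Hastings update of the behavioral parameter vector."""
    proposal = theta + rng.normal(0.0, step_sd, size=theta.shape)
    # Uniform prior: zero density outside the box, so reject immediately.
    if np.any(proposal < LOWER) or np.any(proposal > UPPER):
        return theta, f_theta
    f_prop = pseudo_likelihood(proposal)
    # Symmetric Gaussian kernel cancels, leaving the ratio of canalization-
    # weighted intakes; negative intakes are clipped to zero for illustration.
    ratio = max(f_prop, 0.0) / f_theta if f_theta > 0 else np.inf
    if rng.random() < min(1.0, ratio ** k):
        return proposal, f_prop
    return theta, f_theta
```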

2. Foraging Simulation and Cognitive Mapping

Agents experience a spatial landscape (a continuous 100 × 100 torus) with temporally evolving resource fields $Q_1(x, t)$ and $Q_2(x, t)$. Each agent maintains a cognitive map $C(x, t)$ based on the following components (see the sketch after this list):

  • Instantaneous perception: $p(x, y) = \exp(-d(x, y)/\rho)$.
  • Exponentially decaying memory: $m(\tau) = \exp(-\beta\tau)$.
  • Default expectation: $q$ for never-visited patches.
  • Preference weighting: $\tilde Q(x, t) = h\, Q_1(x, t) + (1-h)\, Q_2(x, t)$.
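
The way these terms combine into $C(x, t)$ is specified in the original paper; the sketch below is one plausible combination (perception-weighted blend of memory recall and the naive default $q$), included for illustration only:

```python
import numpy as np

def preference_weighted(q1, q2, h):
    """Preference weighting: Q~(x, t) = h * Q1(x, t) + (1 - h) * Q2(x, t)."""
    return h * q1 + (1.0 - h) * q2

def cognitive_map_value(dist_to_site, t, last_visit, remembered, theta, rho=5.0):
    """Illustrative value C(x, t) for a site at distance `dist_to_site`.

    `remembered` is the preference-weighted quality recorded at the last visit
    (None if never visited); how perception, memory decay, and the naive
    default q are blended here is an assumption for illustration.
    """
    beta, _gamma, q_default, _h = theta
    perception = np.exp(-dist_to_site / rho)           # p(x, y) = exp(-d / rho)
    if remembered is None:
        expectation = q_default                        # never-visited patch
    else:
        recall = np.exp(-beta * (t - last_visit))      # m(tau) = exp(-beta * tau)
        expectation = recall * remembered + (1.0 - recall) * q_default
    return perception * expectation
```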

At each decision point, candidate next sites are drawn according to a step-length distribution governed by $\gamma$ (e.g., exponential or gamma), with turning angles drawn from a von Mises distribution. Site selection is then stochastic, with probability proportional to $C(x, t)^\lambda$; large $\lambda$ strongly biases the move toward high-perceived-quality patches. Movement toward a selected point of interest follows a correlated random walk, but opportunistic patch switches are allowed if intervening sites exceed the expected quality.
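
A sketch of one decision step, assuming an exponential step-length kernel with mean $1/\gamma$, a fixed von Mises concentration, and a hypothetical `map_value` hook into the cognitive map:

```python
import numpy as np

rng = np.random.default_rng(1)

def propose_and_select(position, heading, gamma, lam, map_value,
                       n_candidates=20, kappa=2.0, world=100.0):
    """Draw candidate sites and select one with probability proportional to C^lambda.

    `map_value` is a hypothetical hook returning the cognitive-map value
    C(x, t) of a candidate site; the exponential step-length kernel with mean
    1/gamma and the von Mises concentration kappa are illustrative choices.
    """
    steps = rng.exponential(1.0 / gamma, size=n_candidates)    # step lengths
    turns = rng.vonmises(0.0, kappa, size=n_candidates)        # turning angles
    angles = heading + turns
    offsets = np.stack([steps * np.cos(angles), steps * np.sin(angles)], axis=1)
    candidates = (position + offsets) % world                  # wrap on the torus
    weights = np.array([map_value(c) ** lam for c in candidates])
    probs = weights / weights.sum()
    idx = rng.choice(n_candidates, p=probs)
    return candidates[idx], angles[idx]
```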

Resource depletion and regrowth are local processes: visiting a site reduces $Q(x, t)$ by $d_L$, and the resource recovers at rate $r_L$ per step.
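
A small sketch of this local dynamic, assuming depletion is floored at zero and regrowth is additive up to a site-specific capacity (both illustrative assumptions):

```python
import numpy as np

def update_resource(Q, visited_idx, capacity, d_L=0.5, r_L=0.05):
    """Deplete the visited cell by d_L, then let every cell regrow at rate r_L.

    Flooring at zero and additive regrowth capped at `capacity` are
    illustrative assumptions about the local dynamics.
    """
    Q = Q.copy()
    Q[visited_idx] = max(0.0, Q[visited_idx] - d_L)   # local depletion
    Q = np.minimum(capacity, Q + r_L)                 # regrowth toward capacity
    return Q
```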

3. Learning Dynamics, Canalization, and Plasticity

The Metropolis–Hastings SP-Random Walk iterates as follows:

  1. Propose a new behavioral parameter vector $\theta'$ via a random perturbation of the current $\theta$.
  2. Simulate foraging under $\theta'$ and compute $f(\theta' \mid Q)$ over $N_{\text{avg}}$ tracks.
  3. Accept or reject $\theta'$ using the acceptance probability given above.
  4. Repeat for $N_{\text{iter}}$ iterations, discarding an initial burn-in period of $N_{\text{burn}}$ iterations (see the loop sketch below).
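
Putting these steps together, a compact sketch of the full chain, assuming a `pseudo_likelihood` callable as sketched in Section 1; the prior box, step size, and treatment of non-positive intakes are illustrative choices:

```python
import numpy as np

def sp_random_walk(pseudo_likelihood, theta0, k=2.0, n_iter=2000, n_burn=500,
                   step_sd=0.05, lower=(0, 0, 0, 0), upper=(5, 5, 1, 1), seed=0):
    """Metropolis-Hastings chain over theta = (beta, gamma, q, h).

    `pseudo_likelihood` simulates foraging under a parameter vector and
    returns the averaged net intake f(theta | Q); bounds and step size are
    illustrative, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    theta = np.asarray(theta0, float)
    f_cur = pseudo_likelihood(theta)
    samples = []
    for it in range(n_iter):
        prop = theta + rng.normal(0.0, step_sd, size=theta.shape)
        if np.all(prop >= lower) and np.all(prop <= upper):    # uniform prior support
            f_prop = pseudo_likelihood(prop)
            # Symmetric proposal: acceptance ratio [f(theta') / f(theta)]^k,
            # with non-positive intakes clipped for illustration.
            ratio = max(f_prop, 0.0) / f_cur if f_cur > 0 else np.inf
            if rng.random() < min(1.0, ratio ** k):
                theta, f_cur = prop, f_prop
        if it >= n_burn:
            samples.append(theta.copy())
    return np.array(samples)
```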

The exponent $k$ modulates exploitation/exploration: low $k$ (plasticity) admits broad, multi-modal distributions and fast adaptation after environmental shifts, but high within-chain variance; high $k$ (canalization) locks onto peaks in $f(\theta \mid Q)$, maximizing near-optimal foraging when conditions are static but hindering adaptation to change. Performance is maximized at intermediate values: excessive plasticity leads to intermittent poor bouts, while rigidity locks agents into suboptimal strategies after sudden environmental reconfiguration (Thompson et al., 2022).

4. Computational Details and Convergence

  • Each SP-Random Walk chain is run for $N_{\text{iter}} = 2000$ steps, with $N_{\text{burn}} = 500$ discarded as burn-in.
  • Convergence is assessed using the earth-mover's distance between marginal posteriors from independent chains under the same scenario.
  • In all tested settings, $N_{\text{iter}} = 2000$ suffices for stationary posteriors.
  • Parameter updates are performed on transformed scales to ensure positivity and constraint compliance; proposals use symmetric kernels (e.g., Gaussian in logit or log space). A sketch of the transform and convergence check follows.
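
A sketch of these two details, using `scipy.stats.wasserstein_distance` for the 1-D earth-mover's distance and log/logit transforms for the constrained parameters; the tolerance and transform choices are illustrative assumptions:

```python
import numpy as np
from scipy.special import expit, logit
from scipy.stats import wasserstein_distance

def to_unconstrained(theta):
    """Log-transform the positive rates (beta, gamma) and logit-transform the
    [0, 1] parameters (q, h); an assumed but conventional choice of transforms."""
    beta, gamma, q, h = theta
    return np.array([np.log(beta), np.log(gamma), logit(q), logit(h)])

def to_constrained(z):
    return np.array([np.exp(z[0]), np.exp(z[1]), expit(z[2]), expit(z[3])])

def chains_converged(chain_a, chain_b, tol=0.05):
    """Compare marginal posteriors of two independent chains, parameter by
    parameter, via the 1-D earth-mover's (Wasserstein) distance."""
    dists = np.array([wasserstein_distance(chain_a[:, j], chain_b[:, j])
                      for j in range(chain_a.shape[1])])
    return dists, bool(np.all(dists < tol))
```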

5. Key Results, Insights, and Biological Interpretation

Under static conditions, SP-Random Walk recovers classic predictions of foraging theory:

  • High canalization ($k \gtrsim 1$): unimodal posteriors, deterministic choice of high-$f(\theta \mid Q)$ parameters, rapid convergence.
  • Low canalization (plasticity): broad $\theta$-distributions, greater behavioral variability, higher rates of suboptimal bouts, but superior adaptation to abrupt environmental transitions (resource redistribution or swapping).
  • Across all environmental settings, agents prefer highly concentrated resources even if less abundant, aligning with ideal free distribution theory: high quality outweighs raw abundance (Thompson et al., 2022).

When landscape statistics change abruptly:

  • Canalized agents persist in outdated strategies; plastic agents adapt by broadening their behavioral sampling.
  • Mean net energetic intake is maximized at intermediate $k$, confirming a theoretical trade-off between robustness and opportunism.

The learning architecture requires very few biological assumptions: no explicit memory outside parameter updating, no direct encoding of strategies, and no explicit cost-of-movement or cognitive load (other than via energetic returns). It is thus extensible to multiple ecological and cognitive tasks.

6. Connections, Extensions, and Limitations

Assumptions:

  • Perception decays strictly exponentially with distance.
  • Memory decay is purely exponential.
  • Unvisited patches all share the same default quality qq.
  • The parameters $\lambda$, $\gamma$, and $\beta$ are held fixed within each chain.

Omitted dynamics:

  • No explicit inter-agent competition, predation, social learning, or genetic adaptation.

Extensions:

  • The framework can be immediately extended to any learning problem expressible as "simulator + pseudo-likelihood": e.g., collective cognition, decision-making under risk, economic choices (Thompson et al., 2022).

The SP-Random Walk MCMC protocol provides a unified bridge between Bayesian statistical inference, stochastic simulations of animal learning, and landscape-level foraging optimization, with extensive parameterization possible for environmental heterogeneity, sensory noise, and memory implementations.

7. Table: Core Components of the SP-Random Walk Model

| Component | Description | Mathematical Implementation |
|---|---|---|
| Strategy vector | Behavioral parameters $(\beta, \gamma, q, h)$ | Uniform prior $p(\theta)$ on a defined hypercube |
| Intake function | Energetic net return for behavior $\theta$ | $f(\theta \mid Q)$ from individual-based model (IBM) simulation over $T$ steps |
| Posterior update | Combines prior and "pseudo-likelihood" | $p(\theta \mid \text{data}) \propto p(\theta)\,[f(\theta \mid Q)]^k$ |
| Parameter proposal | Random walk in parameter space | Symmetric proposal kernel (Gaussian in log/logit space) |
| Acceptance rule | Metropolis–Hastings acceptance probability | Ratio of weighted $f(\theta \mid Q)$ and priors |

The SP-Random Walk thus formalizes behavioral adaptation and environmental learning as an explicit, Bayesian, energetically guided exploration of behavioral-parameter space through repeated, simulator-driven random walks in $\theta$, with foraging performance acting as a pseudo-likelihood (Thompson et al., 2022).

References

  • Thompson et al. (2022). "Simulating how animals learn: a new modelling framework applied to the process of optimal foraging."