Non-Markovian MFGs: Memory and Dynamics

Updated 4 July 2026
  • Non-Markovian mean-field games are models where agents’ decisions depend on both current states and historical dynamics, integrating memory effects into strategic interactions.
  • These frameworks employ techniques such as time-fractional derivatives, path-dependent survivor measures, and generalized BSDEs to capture non-Markovian behavior.
  • They offer robust approaches for analyzing Nash equilibria in complex systems, including anomalous diffusion, absorption phenomena, and recursive utility settings.

Non-Markovian mean-field games are mean-field game models in which the representative agent’s optimization problem or the consistency condition depends on temporal history rather than only on the current state. In the literature surveyed here, non-Markovianity appears in several distinct forms: subdiffusive dynamics generated by an inverse stable subordinator and encoded by Caputo time-fractional derivatives (Qing et al., 2018); path-dependent interactions through the empirical sub-probability measure of survivors and the history of absorptions (Campi et al., 2019); weak-form games with coefficients depending on the joint distribution of states and controls on path space (Possamaï et al., 2021); and recursive utility portfolio games with Epstein-Zin preferences formulated without any Markov structure and characterized by BSDEs (Fu et al., 12 May 2025). Across these formulations, the common objective is to define Nash equilibria in a continuum of strategically interacting agents when memory, delayed effects, absorption history, or non-separable intertemporal preferences invalidate the classical dynamic programming paradigm.

1. Core formulations of non-Markovianity

A first class of non-Markovian mean-field games arises from anomalous diffusion. In the time-fractional formulation, the microscopic state process is obtained by time-changing a diffusion $Y_s$ with the inverse $E_t$ of a strictly increasing $\beta$-stable subordinator $D_t$, where $\beta\in(0,1)$. The resulting trajectory

$$X_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s$$

is continuous, non-Markovian, non-Gaussian, and carries a long-term memory because $E_t$ “pauses” the motion for random heavy-tailed times (Qing et al., 2018). In this setting, the law $m(t,x)$ satisfies a time-fractional Fokker-Planck equation, and equilibrium is described by a coupled time-fractional HJB-FP system.
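As a rough illustration of the time-change construction, the following Python sketch simulates one path of the time-changed process. It is a minimal sketch under stated assumptions, not the construction used by Qing et al.: the function names, the Kanter-type sampler for the stable increments, the Euler-Maruyama step, and the grid-based approximation of the inverse subordinator are all choices made here for illustration.

```python
import numpy as np

def stable_increments(beta, ds, n, rng):
    """Increments of a beta-stable subordinator over steps of length ds.

    Assumption of this sketch: Kanter's representation of a positive stable
    variable S with Laplace transform exp(-s**beta), rescaled by ds**(1/beta).
    """
    u = rng.uniform(0.0, np.pi, size=n)
    w = rng.exponential(1.0, size=n)
    s = (np.sin(beta * u) / np.sin(u) ** (1.0 / beta)) \
        * (np.sin((1.0 - beta) * u) / w) ** ((1.0 - beta) / beta)
    return ds ** (1.0 / beta) * s

def simulate_subdiffusion(beta, T, n_t, ds, v, y0=0.0, seed=0):
    """Return (t_grid, E, X) with X approximating Y evaluated at E_t."""
    rng = np.random.default_rng(seed)
    # Subordinator D on the operational-time grid k*ds, extended until it passes T.
    D = np.array([0.0])
    while D[-1] <= T:
        D = np.concatenate([D, D[-1] + np.cumsum(stable_increments(beta, ds, 1024, rng))])
    # Inverse subordinator, approximated by the last grid index with D <= t.
    t_grid = np.linspace(0.0, T, n_t + 1)
    idx = np.searchsorted(D, t_grid, side="right") - 1
    E = ds * idx
    # Euler-Maruyama for dY = v(s, Y) ds + sqrt(2) dB in operational time.
    n_s = int(idx.max())
    Y = np.empty(n_s + 1)
    Y[0] = y0
    noise = rng.normal(0.0, np.sqrt(2.0 * ds), size=n_s)
    for k in range(n_s):
        Y[k + 1] = Y[k] + v(k * ds, Y[k]) * ds + noise[k]
    return t_grid, E, Y[idx]

# Hypothetical drift v(s, y) = -y, chosen only to have something to simulate.
t_grid, E, X = simulate_subdiffusion(beta=0.7, T=1.0, n_t=200, ds=1e-3,
                                     v=lambda s, y: -y)
```

Flat stretches of `E` correspond to large jumps of the subordinator; on those stretches `X` does not move, which is the "pausing" mechanism behind the long-term memory.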

A second class is path dependence through absorption. In the framework of games with smooth dependence on past absorptions, each player’s trajectory lives in $X=C([0,T];\mathbb{R}^d)$, with absorption time

$$\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,$$

and interaction enters through the non-normalized empirical sub-probability measure of survivors,

$$\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},$$

together with the fraction absorbed,

EtE_t0

In the MFG limit, both EtE_t1 and EtE_t2 retain the history of losses from the game (Campi et al., 2019).
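The survivor measure can be computed directly from simulated paths. The sketch below is a minimal one-dimensional example, assuming uncontrolled Brownian paths and the domain $O=(-1,1)$; the function name `survivor_statistics` and all parameter values are hypothetical. The fraction absorbed is computed here as one minus the total mass of the survivor measure, which follows from the $1/N$ normalization but is not necessarily the notation of Campi et al.

```python
import numpy as np

def survivor_statistics(paths, t_grid, lower, upper, f=None):
    """Survivor mass, fraction absorbed, and optionally the integral of f
    against the empirical survivor measure, on a time grid.

    paths has shape (N, len(t_grid)); O is the open interval (lower, upper).
    """
    outside = (paths <= lower) | (paths >= upper)
    hit = outside.any(axis=1)
    first = np.argmax(outside, axis=1)            # first grid index outside O
    # Absorption time: first exit from O, capped at the horizon T.
    tau = np.where(hit, t_grid[first], t_grid[-1])
    alive = t_grid[None, :] < tau[:, None]        # indicator of {t < tau_i}
    mass = alive.mean(axis=0)                     # total mass of the survivor measure
    absorbed = 1.0 - mass                         # fraction of players absorbed by t
    integral = None if f is None else (f(paths) * alive).mean(axis=0)
    return mass, absorbed, integral

rng = np.random.default_rng(1)
N, n_steps, T = 2000, 400, 1.0
t_grid = np.linspace(0.0, T, n_steps + 1)
steps = rng.normal(0.0, np.sqrt(T / n_steps), size=(N, n_steps))
paths = np.concatenate([np.zeros((N, 1)), np.cumsum(steps, axis=1)], axis=1)
mass, absorbed, mean_pos = survivor_statistics(paths, t_grid, -1.0, 1.0,
                                               f=lambda x: x)
```

Because the absorption time is capped at $T$, the indicator $1_{\{t<\tau\}}$ vanishes at $t=T$, so `mass` decreases from 1 to 0 over the horizon. Once a player is absorbed it never re-enters, which is why the interaction at time $t$ depends on the whole history of exits and not only on current positions.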

A third class is fully non-Markovian weak-form control. Here the state process is specified on path space EtE_t3, the control is an EtE_t4-valued predictable process, and the controlled law is defined by Girsanov transformation. The coefficients may depend on the full stopped path EtE_t5 and on the time-indexed law flow EtE_t6, allowing interaction through the joint distribution of players’ states and controls (Possamaï et al., 2021).

A fourth class comes from recursive utility. In mean field portfolio games with Epstein-Zin preferences, the game is developed “in a general non-Markovian framework” and “without assuming any Markov structure.” The representative agent’s utility is defined by a BSDE-type recursion with externality EtE_t7, and equilibrium requires

EtE_t8

The non-Markovian feature is not only path dependence of coefficients, but also recursion in utility and conditioning on common noise (Fu et al., 12 May 2025).

These formulations show that “non-Markovian” is not a single modeling choice. It may refer to memory in the state dynamics, memory in the interaction term, a weak formulation on path space, or recursive intertemporal preferences.

2. Time-fractional mean-field games and subdiffusive memory

The time-fractional MFG system studied in “Variational time-fractional Mean Field Games” couples a backward time-fractional HJB equation for the value function EtE_t9 with a forward time-fractional Fokker-Planck equation for the density β(0,1)\beta\in(0,1)0: β(0,1)\beta\in(0,1)1

β(0,1)\beta\in(0,1)2

Here the forward and backward Caputo derivatives are nonlocal in time, since they are convolutions against the kernel β(0,1)\beta\in(0,1)3, and therefore explicitly encode memory (Qing et al., 2018).

The forward Caputo derivative is

β(0,1)\beta\in(0,1)4

and the backward Caputo derivative on β(0,1)\beta\in(0,1)5 is

β(0,1)\beta\in(0,1)6

Because these operators are nonlocal in time, the instantaneous rate of change depends on the entire history. In the limit β(0,1)\beta\in(0,1)7, they recover the classical Markovian derivative, which the paper identifies with the kernel limit β(0,1)\beta\in(0,1)8 and β(0,1)\beta\in(0,1)9 (Qing et al., 2018).
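The dependence on the entire history can be made concrete with a standard discretization. The sketch below assumes the textbook forward Caputo derivative of order $\beta\in(0,1)$, $\frac{1}{\Gamma(1-\beta)}\int_0^t (t-s)^{-\beta} f'(s)\,ds$, which may differ in notation from the definition used by Qing et al., and approximates it with the classical L1 scheme. The function name `caputo_l1` is hypothetical.

```python
import math
import numpy as np

def caputo_l1(f_vals, h, beta):
    """L1 approximation of the forward Caputo derivative of order beta in (0, 1).

    f_vals[n] = f(n*h); returns d with d[n] approximating the derivative at n*h
    (d[0] is set to 0).
    """
    f_vals = np.asarray(f_vals, dtype=float)
    n_pts = f_vals.size
    k = np.arange(n_pts - 1)
    # Memory weights: b[j] multiplies the increment of f lying j steps in the past.
    b = (k + 1.0) ** (1.0 - beta) - k ** (1.0 - beta)
    df = np.diff(f_vals)
    d = np.zeros(n_pts)
    for n in range(1, n_pts):
        # Every past increment df[0], ..., df[n-1] contributes to d[n].
        d[n] = np.dot(b[:n], df[n - 1::-1])
    return d * h ** (-beta) / math.gamma(2.0 - beta)

# Check on f(t) = t, whose Caputo derivative is t**(1 - beta) / Gamma(2 - beta).
h, beta = 0.01, 0.5
t = h * np.arange(101)
approx = caputo_l1(t, h, beta)
exact = t ** (1.0 - beta) / math.gamma(2.0 - beta)
```

The weights `b` decay slowly but never vanish, so the value at step `n` uses all earlier increments. As $\beta$ approaches 1, the weights beyond `b[0]` shrink to zero and the scheme reduces to an ordinary backward difference, in line with the Markovian limit described above.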

In the quadratic case DtD_t0, the system becomes

DtD_t1

The paper interprets this as an extension of variational mean-field games to the subdiffusive situation, providing an Eulerian interpretation of time-fractional MFG systems (Qing et al., 2018).

Under the assumptions that DtD_t2 is continuous in DtD_t3, satisfies DtD_t4, is increasing in DtD_t5, and the Lasry-Lions-type monotonicity condition

DtD_t6

holds, together with DtD_t7 and DtD_t8, the paper proves existence of a classical solution

DtD_t9

uniqueness under monotonicity, and mass conservation and positivity: Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s0 and Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s1 for all Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s2 (Qing et al., 2018).

The motivating applications listed in the paper are anomalous transport in porous media and biological cells, latency and long-memory in high-frequency trading, and subdiffusive supply-demand dynamics. A plausible implication is that time-fractional MFGs are especially suited to strategic environments where waiting-time heterogeneity is itself part of the aggregate interaction.

3. Path-dependent interactions through absorption and survivor measures

The absorption framework develops a non-Markovian MFG in which players leave the game when their private states hit the boundary of a domain Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s3. The controlled representative-player dynamics are

Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s4

where Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s5 is obtained from

Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s6

through the re-parameterization

Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s7

and

Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s8

This construction makes the interaction depend on the measure of surviving states and therefore on the cumulative history of absorptions (Campi et al., 2019).

The representative-player cost is

Xt=YEt,dYs=v(s,Ys)ds+2dBsX_t=Y_{E_t}, \qquad dY_s=v(s,Y_s)\,ds+\sqrt{2}\,dB_s9

When EtE_t0, the paper states that one sees explicitly the dependence on the whole history EtE_t1 and equivalently on EtE_t2 (Campi et al., 2019).

A strict feedback MFG solution is a pair EtE_t3 such that EtE_t4 minimizes EtE_t5 among all feedbacks and, if EtE_t6 solves the controlled SDE under EtE_t7, then

EtE_t8

for all EtE_t9. The same formulation is extended to relaxed feedbacks m(t,x)m(t,x)0. In fixed-point form, the solution is characterized by

m(t,x)m(t,x)1

or equivalently through a law-operator m(t,x)m(t,x)2 whose fixed points are MFG solutions (Campi et al., 2019).

Under Lipschitz and sub-linear growth of m(t,x)m(t,x)3 in m(t,x)m(t,x)4, continuity in the measure variable relative to m(t,x)m(t,x)5-Wasserstein at Wiener-absolutely-continuous laws, compactness of m(t,x)m(t,x)6, nondegeneracy of m(t,x)m(t,x)7, admissible initial law m(t,x)m(t,x)8, and a convexity assumption m(t,x)m(t,x)9 on the Hamiltonian minimizers, the paper proves existence of both a relaxed feedback solution X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)0 and, under an extra control-convexity assumption, a strict feedback solution X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)1 (Campi et al., 2019).

Uniqueness is established under additional assumptions: X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)2 splits as X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)3, the Lasry-Lions condition

X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)4

holds for all X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)5, the drift is independent of X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)6, and for each fixed X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)7 the single-player control problem has a unique minimizer. In that case, any two feedback solutions coincide (Campi et al., 2019).

The same paper links the continuum model to the finite-$N$ game. In the finite-dimensional interaction case

X=C([0,T];Rd)X=C([0,T];\mathbb{R}^d)9

a feedback MFG solution induces an τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,0-Nash equilibrium for the τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,1-player game with τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,2 as τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,3. No explicit rate is given, although the text notes that in many examples one obtains τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,4 by standard law-of-large-numbers arguments (Campi et al., 2019).

4. Weak formulation, McKean-Vlasov BSDEs, and equilibrium characterization

The weak-form theory developed in “Non-asymptotic convergence rates for mean-field games: weak formulation and McKean-Vlasov BSDEs” considers a fully non-Markovian setting with drift control and interaction through the joint distribution of states and controls. Under a reference probability measure τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,5, the uncontrolled coordinate process satisfies

τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,6

Given a control τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,7 and a path-law flow τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,8, the controlled measure τφ=inf{t[0,T]:φ(t)O}T,\tau^\varphi=\inf\{t\in[0,T]:\varphi(t)\notin O\}\wedge T,9 is defined by the Girsanov density

μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},0

under which

μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},1

The drift is bounded, Lipschitz in path, measure, and control, and dissipative in the path variable: μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},2 The reward data μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},3 and μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},4 are Borel, polynomial-growth, and Lipschitz in all arguments (Possamaï et al., 2021).
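The role of the change of measure can be illustrated numerically: paths are simulated once under the reference measure, and the drift enters only through a likelihood weight. The sketch below is a generic Girsanov reweighting under simplifying assumptions that are not taken from Possamaï et al.: a one-dimensional state, a constant volatility `sigma`, and a drift that depends only on time and the current state (standing in for the path-, law-, and control-dependent drift of the paper). All names are hypothetical.

```python
import numpy as np

def girsanov_expectation(b, sigma, T, n_steps, n_paths, payoff, seed=0):
    """Estimate the expectation of payoff(X) under the drifted measure by
    reweighting driftless paths X = sigma * W simulated under the reference measure.

    Returns (weighted payoff estimate, sample mean of the density).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.cumsum(dW, axis=1)
    X = sigma * np.concatenate([np.zeros((n_paths, 1)), W], axis=1)
    t = dt * np.arange(n_steps)                    # left endpoints of each step
    theta = b(t[None, :], X[:, :-1]) / sigma       # integrand sigma^{-1} b along each path
    # Discretized stochastic exponential of the integral of theta against W.
    log_Z = np.sum(theta * dW, axis=1) - 0.5 * np.sum(theta ** 2, axis=1) * dt
    Z = np.exp(log_Z)
    return np.mean(Z * payoff(X)), np.mean(Z)

# With a constant drift 0.5 and sigma = 1, the mean of X_T under the drifted
# measure is 0.5 * T, which the reweighted estimate should approximate.
est, z_bar = girsanov_expectation(b=lambda t, x: 0.5 + 0.0 * x, sigma=1.0, T=1.0,
                                  n_steps=20, n_paths=100_000,
                                  payoff=lambda X: X[:, -1])
```

The design point is that the simulated paths never change when the control or the law flow changes; only the weights do. This is what allows the weak formulation to handle drifts that depend on the full stopped path without a Markovian state description.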

For a pair μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},5, the objective is

μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},6

and a mean-field equilibrium is defined by optimality of μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},7 under μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},8 together with the consistency condition

μtN=1Ni=1NδXtN,i1{t<τN,i},\mu_t^N=\frac1N\sum_{i=1}^N \delta_{X_t^{N,i}}\,1_{\{t<\tau^{N,i}\}},9

The paper then introduces the Hamiltonian

EtE_t00

with measurable selection EtE_t01, and proves that EtE_t02 is a mean-field equilibrium if and only if there exist processes EtE_t03 solving a generalized McKean-Vlasov BSDE, with

EtE_t04

Moreover EtE_t05 (Possamaï et al., 2021).

This BSDE characterization is the central replacement for the Markovian HJB/master-equation route. Under the Lipschitz and dissipativity assumptions on EtE_t06, and either a small terminal payoff EtE_t07 or a smooth terminal payoff in the measure argument, the BSDE admits a unique solution EtE_t08 in

EtE_t09

The proof combines change of measure, a fixed-point argument on EtE_t10, contraction in a weighted norm, and the use of dissipativity to obtain global-in-EtE_t11 bounds (Possamaï et al., 2021).

A common misconception is that well-posedness and uniqueness in mean-field games necessarily require short time horizon, separability assumptions, or Lasry-Lions monotonicity. In this weak non-Markovian setting, the paper explicitly states that its existence and uniqueness results “do not require short time horizon, separability assumptions on the coefficients, nor Lasry and Lions's monotonicity conditions, but rather smallness, or alternatively regularity, conditions on the terminal reward and a dissipativity condition on the drift” (Possamaï et al., 2021).

5. Recursive utilities and non-Markovian portfolio games

The portfolio-game model with Epstein-Zin preferences provides a different non-Markovian mechanism, centered on recursive utility rather than on path-dependent state dynamics alone. On a filtered space carrying common noise EtE_t12 and idiosyncratic noise EtE_t13, the wealth dynamics of a representative agent are

EtE_t14

with EtE_t15, bounded progressive coefficients, and EtE_t16 (Fu et al., 12 May 2025).

Given an externality EtE_t17, the Epstein-Zin utility satisfies the recursion

EtE_t18

where

EtE_t19

EtE_t20

EtE_t21

In the MFG limit, equilibrium requires the externality to coincide with the conditional log-average of optimal consumption and terminal wealth: EtE_t22 The paper establishes a uniqueness result by proving a one-to-one correspondence between Nash equilibria and the solutions to a class of BSDEs (Fu et al., 12 May 2025).

The core BSDE is

EtE_t23

with the driver

EtE_t24

The corresponding equilibrium controls are

EtE_t25

A necessary stochastic maximum principle tailored to Epstein-Zin utility and a nonlinear transformation are the key ingredients in obtaining this characterization (Fu et al., 12 May 2025).

The stochastic maximum principle introduces adjoint processes EtE_t26 and EtE_t27, with Hamiltonian

EtE_t28

and first-order conditions

EtE_t29

The nonlinear transformation

EtE_t30

then converts the adjoint system into the BSDE for equilibrium. According to the paper, this “log-ratio” transform is the key link between the SMP adjoints and the final BSDE characterization of the MFG Nash equilibrium (Fu et al., 12 May 2025).

In the deterministic case, where EtE_t31 depend only on EtE_t32, the BSDE reduces to a deterministic ODE and then to a Riccati equation, yielding an explicit closed-form solution for the equilibrium investment and consumption policies (Fu et al., 12 May 2025).

6. Solution concepts, convergence, and analytical themes

Across the works surveyed here, non-Markovian MFGs are solved through several distinct but related analytical mechanisms.

| Framework | Equilibrium device | Principal well-posedness or limit statement |
| --- | --- | --- |
| Time-fractional subdiffusion (Qing et al., 2018) | Variational formulation via convex duality | Existence of a classical solution; uniqueness under monotonicity |
| Past absorptions (Campi et al., 2019) | Fixed point in feedback or relaxed feedback form | Existence of relaxed and strict feedback solutions; approximate Nash equilibria with EtE_t33 |
| Weak path-dependent MFG (Possamaï et al., 2021) | Generalized McKean-Vlasov BSDE | Unique solution in EtE_t34 under dissipativity and smallness or smoothness |
| Epstein-Zin portfolio game (Fu et al., 12 May 2025) | BSDE plus stochastic maximum principle | One-to-one correspondence between Nash equilibria and a class of BSDEs; uniqueness result |

In the time-fractional model, the coupled HJB-FP system is the Euler-Lagrange system of two dual convex problems. The primal functional EtE_t35 is defined on EtE_t36 with EtE_t37, the dual functional EtE_t38 is defined on EtE_t39 satisfying the fractional-FP constraint, and Fenchel-Rockafellar duality yields

EtE_t40

Taking first variations recovers precisely the HJB and FP equations with the correct fractional-time operators (Qing et al., 2018).

In the absorption model, existence proceeds by truncating drift and cost, applying the bounded-data existence result of Campi-Delarue via Kakutani’s theorem, passing to the limit using tightness in EtE_t41 and a martingale-problem characterization, and finally using a mimicking theorem of Brunick-Shreve to convert the limit into feedback form (Campi et al., 2019).

In the weak non-Markovian path-space formulation, the equilibrium BSDE is used both for well-posedness and for limit theory. The paper proves non-asymptotic convergence estimates for the EtE_t42-player game: EtE_t43 and

EtE_t44

Here EtE_t45 measures the control-interaction correction and

EtE_t46

In the state-dependent-law case, the paper states that one can insert classical rates such as EtE_t47, leading overall to error of order EtE_t48 in dimension EtE_t49 (Possamaï et al., 2021).
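Rates of this kind rest on how fast an empirical measure of $N$ independent samples approaches its limiting law. The sketch below is only a one-dimensional illustration of that ingredient and does not reproduce the paper's estimates: it assumes a Uniform(0,1) law, approximates the Wasserstein-1 distance through sorted samples, and fits a decay exponent on a log-log scale. The function names and sample sizes are hypothetical.

```python
import numpy as np

def w1_to_uniform(sample):
    """Approximate Wasserstein-1 distance between the empirical law of `sample`
    and Uniform(0, 1), using the quantile representation of W1 in one dimension."""
    x = np.sort(sample)
    n = x.size
    q = (np.arange(n) + 0.5) / n       # midpoint quantile levels of Uniform(0, 1)
    return np.mean(np.abs(x - q))

def mean_w1(n, n_rep=50, seed=0):
    """Average the distance over independent repetitions to smooth out noise."""
    rng = np.random.default_rng(seed)
    return np.mean([w1_to_uniform(rng.uniform(size=n)) for _ in range(n_rep)])

sizes = [100, 1000, 10000]
errors = [mean_w1(n) for n in sizes]
# Slope of log(error) against log(N): an empirical estimate of the decay exponent.
slope = np.polyfit(np.log(sizes), np.log(errors), 1)[0]
```

In one dimension the fitted exponent should be close to $-1/2$. In higher dimension the decay of the empirical-measure error is known to be slower, which is one reason the overall rate in such results depends on the dimension.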

The same weak-form program also treats closed-loop equilibria and, in the Markovian case, the master equation. When a smooth master solution exists, the paper recovers

EtE_t50

and establishes

EtE_t51

which yields convergence of the finite-$N$ HJB system to the master equation (Possamaï et al., 2021).

A plausible implication of these results is that non-Markovianity shifts the analytic emphasis from PDE structure alone toward convex duality, weak formulations, stochastic maximum principles, and BSDE fixed points.

7. Conceptual distinctions, misconceptions, and research directions

One recurring distinction is between memory in dynamics and memory in interaction. In the time-fractional model, memory is generated at the microscopic level by a subdiffusion process EtE_t53, and the macroscopic equations inherit Caputo derivatives (Qing et al., 2018). In the absorption model, the state dynamics are standard diffusions but the mean-field interaction depends on survivor measures and the accumulated fraction absorbed, which is path-dependent through EtE_t54 and EtE_t55 (Campi et al., 2019). In the weak formulation and Epstein-Zin portfolio game, non-Markovianity is instead encoded through full path dependence of coefficients, law dependence on path space, or recursive utility characterized by BSDEs (Possamaï et al., 2021, Fu et al., 12 May 2025).

A second conceptual point concerns the role of monotonicity. In the time-fractional and absorption settings, uniqueness is obtained under monotonicity hypotheses of Lasry-Lions type (Qing et al., 2018, Campi et al., 2019). By contrast, the weak-form BSDE framework explicitly provides existence and uniqueness results that do not require Lasry-Lions monotonicity, replacing it with dissipativity of the drift and smallness or regularity assumptions on the terminal reward (Possamaï et al., 2021). This does not eliminate monotonicity from the subject; rather, it identifies an alternative route to well-posedness in non-Markovian environments.

A third misconception is that abandoning Markov structure necessarily makes equilibrium characterization intractable. The four papers discussed above each provide a counterexample. Time-fractional systems preserve a variational Euler-Lagrange structure (Qing et al., 2018). Absorption models admit strict and relaxed feedback fixed-point formulations and induce approximate Nash equilibria for large finite games (Campi et al., 2019). Weak path-dependent games are characterized by generalized McKean-Vlasov BSDEs and support non-asymptotic convergence rates (Possamaï et al., 2021). Recursive-utility portfolio games admit a one-to-one correspondence between Nash equilibria and BSDE solutions, and even explicit closed forms in the deterministic case (Fu et al., 12 May 2025).

The research directions stated explicitly in these papers also point to broader generality. The Epstein-Zin study states that its combination of stochastic maximum principle, martingale-optimality principle, nonlinear adjoint-to-value transformation, and BSDE fixed-point arguments in BMO spaces can be exported to forward-utility MFGs, rank-dependent or habit-formation utilities, partial-information or jump-diffusion environments, and state-constraint or impulse-control MFGs (Fu et al., 12 May 2025). This suggests that non-Markovian mean-field games are less a narrow subclass than a collection of methodologies for equilibrium analysis beyond the classical dynamic programming paradigm.
