Non-Markovian MFGs: Memory and Dynamics
- Non-Markovian mean-field games are models where agents’ decisions depend on both current states and historical dynamics, integrating memory effects into strategic interactions.
- These frameworks employ techniques such as time-fractional derivatives, path-dependent survivor measures, and generalized BSDEs to capture non-Markovian behavior.
- They offer robust approaches for analyzing Nash equilibria in complex systems, including anomalous diffusion, absorption phenomena, and recursive utility settings.
Non-Markovian mean-field games are mean-field game models in which the representative agent’s optimization problem or the consistency condition depends on temporal history rather than only on the current state. In the supplied literature, non-Markovianity appears in several distinct forms: subdiffusive dynamics generated by an inverse stable subordinator and encoded by Caputo time-fractional derivatives (Qing et al., 2018); path-dependent interactions through the empirical sub-probability measure of survivors and the history of absorptions (Campi et al., 2019); weak-form games with coefficients depending on the joint distribution of states and controls on path space (Possamaï et al., 2021); and recursive utility portfolio games with Epstein-Zin preferences formulated without any Markov structure and characterized by BSDEs (Fu et al., 12 May 2025). Across these formulations, the common objective is to define Nash equilibria in a continuum of strategically interacting agents when memory, delayed effects, absorption history, or non-separable intertemporal preferences invalidate the classical dynamic programming paradigm.
1. Core formulations of non-Markovianity
A first class of non-Markovian mean-field games arises from anomalous diffusion. In the time-fractional formulation, the microscopic state process is obtained by time-changing a diffusion with the inverse of a strictly increasing stable subordinator. The resulting trajectory is continuous, non-Markovian, non-Gaussian, and carries a long-term memory because the inverse subordinator “pauses” the motion for random heavy-tailed times (Qing et al., 2018). In this setting, the law satisfies a time-fractional Fokker-Planck equation, and equilibrium is described by a coupled time-fractional HJB-FP system.
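The time-change construction can be made concrete with a small simulation. The following is a minimal sketch, not the paper's construction: it assumes a driftless Brownian motion as the parent diffusion, writes `beta` for the stability index in (0, 1), and uses hypothetical function names. Positive stable increments are drawn with Kanter's representation, and the inverse subordinator is read off the grid as the first internal time at which the subordinator exceeds the physical time.

```python
import numpy as np

def stable_subordinator_increments(beta, n, dtau, rng):
    """Increments of a beta-stable subordinator over internal-time steps dtau.

    Kanter's representation gives S >= 0 with E[exp(-s*S)] = exp(-s**beta);
    self-similarity then rescales S by dtau**(1/beta).
    """
    u = rng.uniform(0.0, np.pi, size=n)
    e = rng.exponential(1.0, size=n)
    s = (np.sin(beta * u) / np.sin(u) ** (1.0 / beta)
         * (np.sin((1.0 - beta) * u) / e) ** ((1.0 - beta) / beta))
    return dtau ** (1.0 / beta) * s

def subdiffusive_path(beta, t_grid, n_tau=20000, dtau=1e-3, seed=0):
    """Return (E_t, Y_t) on t_grid, where Y_t = X_{E_t} (toy parent: Brownian X)."""
    rng = np.random.default_rng(seed)
    # Subordinator D on the internal-time grid tau_k = k * dtau.
    d = np.concatenate(([0.0], np.cumsum(
        stable_subordinator_increments(beta, n_tau, dtau, rng))))
    # Parent Brownian motion on the same internal-time grid.
    x = np.concatenate(([0.0], np.cumsum(
        rng.normal(0.0, np.sqrt(dtau), size=n_tau))))
    # Inverse subordinator on the grid: first index k with D_{tau_k} > t.
    idx = np.minimum(np.searchsorted(d, t_grid, side="right"), n_tau)
    return idx * dtau, x[idx]

t_grid = np.linspace(0.0, 1.0, 201)
e_t, y_t = subdiffusive_path(0.7, t_grid)
```

Whenever the subordinator jumps across several physical grid points, `e_t` stays constant there and the path `y_t` rests, which is the pausing mechanism described above.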
A second class is path dependence through absorption. In the framework of games with smooth dependence on past absorptions, each player’s trajectory lives in path space, with an absorption time given by the first time the private state hits the boundary of the domain. Interaction enters through the non-normalized empirical sub-probability measure of survivors, together with the fraction absorbed. In the MFG limit, both quantities retain the history of losses from the game (Campi et al., 2019).
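A short particle simulation illustrates how the survivor measure and the absorbed fraction evolve together. This is a hedged sketch under simplifying assumptions that are not taken from the paper: one-dimensional uncontrolled Brownian dynamics, the interval (-1, 1) as domain, and hypothetical names throughout. The non-normalized empirical measure of survivors puts weight 1/N on each player still in the game, so its total mass is the surviving fraction.

```python
import numpy as np

def simulate_survivors(n_players=2000, n_steps=400, T=1.0, sigma=1.0,
                       lo=-1.0, hi=1.0, seed=0):
    """Toy dynamics dX = sigma dW; a player is absorbed on leaving (lo, hi)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = rng.uniform(-0.5, 0.5, size=n_players)    # initial states inside the domain
    alive = np.ones(n_players, dtype=bool)
    mass = np.empty(n_steps + 1)       # total mass of the survivor measure
    first_moment = np.empty(n_steps + 1)  # integral of x against that measure
    mass[0], first_moment[0] = alive.mean(), (x * alive).mean()
    for k in range(1, n_steps + 1):
        noise = sigma * np.sqrt(dt) * rng.normal(size=n_players)
        x = np.where(alive, x + noise, x)         # absorbed players stay frozen
        alive &= (x > lo) & (x < hi)              # absorption is irreversible
        mass[k] = alive.mean()
        first_moment[k] = (x * alive).mean()
    return mass, 1.0 - mass, first_moment

mass, absorbed, first_moment = simulate_survivors()
```

Because absorbed players never return, `mass` is nonincreasing in time and `absorbed` records the cumulative history of exits rather than only the current configuration.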
A third class is fully non-Markovian weak-form control. Here the state process is specified on path space, the control is a predictable process taking values in an action set, and the controlled law is defined by Girsanov transformation. The coefficients may depend on the full path stopped at the current time and on the time-indexed flow of laws, allowing interaction through the joint distribution of players’ states and controls (Possamaï et al., 2021).
A fourth class comes from recursive utility. In mean field portfolio games with Epstein-Zin preferences, the game is developed “in a general non-Markovian framework” and “without assuming any Markov structure.” The representative agent’s utility is defined by a BSDE-type recursion with an externality term, and equilibrium requires that externality to coincide with the conditional log-average of optimal consumption and terminal wealth. The non-Markovian feature is not only path dependence of coefficients, but also recursion in utility and conditioning on common noise (Fu et al., 12 May 2025).
These formulations show that “non-Markovian” is not a single modeling choice. It may refer to memory in the state dynamics, memory in the interaction term, a weak formulation on path space, or recursive intertemporal preferences.
2. Time-fractional mean-field games and subdiffusive memory
The time-fractional MFG system studied in “Variational time-fractional Mean Field Games” couples a backward time-fractional HJB equation for the value function 9 with a forward time-fractional Fokker-Planck equation for the density 0: 1
2
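Here the forward and backward Caputo derivatives are nonlocal in time, since they are convolutions against a power-law kernel, and therefore explicitly encode memory (Qing et al., 2018).
In the standard convention, the forward Caputo derivative of order $\beta \in (0,1)$ is
$$\partial^{\beta}_{(0,t]} f(t) = \frac{1}{\Gamma(1-\beta)} \int_0^t \frac{f'(s)}{(t-s)^{\beta}}\, ds,$$
and the backward Caputo derivative on $[t,T)$ is
$$\partial^{\beta}_{[t,T)} f(t) = -\frac{1}{\Gamma(1-\beta)} \int_t^T \frac{f'(s)}{(s-t)^{\beta}}\, ds.$$
Because these operators are nonlocal in time, the instantaneous rate of change depends on the entire history. In the limit $\beta \to 1$, they recover the classical Markovian derivatives $\partial_t f$ and $-\partial_t f$, respectively (Qing et al., 2018).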
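The nonlocality of the Caputo operator is easy to see in a discretization. Below is a minimal sketch of the standard L1 finite-difference scheme for the forward Caputo derivative of an order `beta` in (0, 1); the function name and grid are illustrative assumptions, and the scheme is a textbook approximation rather than anything prescribed by the paper. Each output value is a weighted sum over all earlier increments of the function, which is the discrete counterpart of the memory kernel.

```python
import math
import numpy as np

def caputo_l1(f_vals, dt, beta):
    """L1 approximation of the forward Caputo derivative of order beta in (0, 1).

    f_vals holds samples f(t_0), ..., f(t_n) on a uniform grid with step dt.
    Returns the approximation at every grid point (set to 0 at t_0).
    """
    n = len(f_vals) - 1
    df = np.diff(f_vals)                               # df[i] = f(t_{i+1}) - f(t_i)
    j = np.arange(n)
    b = (j + 1) ** (1.0 - beta) - j ** (1.0 - beta)    # L1 weights b_j
    scale = dt ** (-beta) / math.gamma(2.0 - beta)
    out = np.zeros(n + 1)
    for k in range(1, n + 1):
        # sum over j = 0..k-1 of b_j * (f(t_{k-j}) - f(t_{k-j-1}))
        out[k] = scale * np.dot(b[:k], df[k - 1::-1])
    return out

beta = 0.6
t = np.linspace(0.0, 1.0, 101)
approx = caputo_l1(t, t[1] - t[0], beta)               # test function f(t) = t
exact = t ** (1.0 - beta) / math.gamma(2.0 - beta)     # known Caputo derivative of t
```

For the linear test function the L1 sum telescopes, so `approx` matches `exact` up to rounding. For general functions the scheme is only an approximation, and the cost per time step grows with the length of the stored history.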
In the quadratic case 0, the system becomes
1
The paper interprets this as an extension of variational mean-field games to the subdiffusive situation, providing an Eulerian interpretation of time-fractional MFG systems (Qing et al., 2018).
Under the assumptions that 2 is continuous in 3, satisfies 4, is increasing in 5, and the Lasry-Lions-type monotonicity condition
6
holds, together with 7 and 8, the paper proves existence of a classical solution
9
uniqueness under monotonicity, and mass conservation and positivity: 0 and 1 for all 2 (Qing et al., 2018).
The motivating applications listed in the paper are anomalous transport in porous media and biological cells, latency and long-memory in high-frequency trading, and subdiffusive supply-demand dynamics. A plausible implication is that time-fractional MFGs are especially suited to strategic environments where waiting-time heterogeneity is itself part of the aggregate interaction.
3. Path-dependent interactions through absorption and survivor measures
The absorption framework develops a non-Markovian MFG in which players leave the game when their private states hit the boundary of a domain 3. The controlled representative-player dynamics are
4
where 5 is obtained from
6
through the re-parameterization
7
and
8
This construction makes the interaction depend on the measure of surviving states and therefore on the cumulative history of absorptions (Campi et al., 2019).
The representative-player cost is
9
When 0, the paper states that one sees explicitly the dependence on the whole history 1 and equivalently on 2 (Campi et al., 2019).
A strict feedback MFG solution is a pair 3 such that 4 minimizes 5 among all feedbacks and, if 6 solves the controlled SDE under 7, then
8
for all 9. The same formulation is extended to relaxed feedbacks 0. In fixed-point form, the solution is characterized by
1
or equivalently through a law-operator 2 whose fixed points are MFG solutions (Campi et al., 2019).
Under Lipschitz and sub-linear growth of 3 in 4, continuity in the measure variable relative to 5-Wasserstein at Wiener-absolutely-continuous laws, compactness of 6, nondegeneracy of 7, admissible initial law 8, and a convexity assumption 9 on the Hamiltonian minimizers, the paper proves existence of both a relaxed feedback solution 0 and, under an extra control-convexity assumption, a strict feedback solution 1 (Campi et al., 2019).
Uniqueness is established under additional assumptions: 2 splits as 3, the Lasry-Lions condition
4
holds for all 5, the drift is independent of 6, and for each fixed 7 the single-player control problem has a unique minimizer. In that case, any two feedback solutions coincide (Campi et al., 2019).
The same paper links the continuum model to the finite-$N$ game. In the finite-dimensional interaction case
9
a feedback MFG solution induces an $\varepsilon_N$-Nash equilibrium for the $N$-player game with $\varepsilon_N \to 0$ as $N \to \infty$. No explicit rate is given, although the text notes that in many examples a rate can be obtained by standard law-of-large-numbers arguments (Campi et al., 2019).
4. Weak formulation, McKean-Vlasov BSDEs, and equilibrium characterization
The weak-form theory developed in “Non-asymptotic convergence rates for mean-field games: weak formulation and McKean-Vlasov BSDEs” considers a fully non-Markovian setting with drift control and interaction through the joint distribution of states and controls. Under a reference probability measure 5, the uncontrolled coordinate process satisfies
6
Given a control 7 and a path-law flow 8, the controlled measure 9 is defined by the Girsanov density
0
under which
1
The drift is bounded, Lipschitz in path, measure, and control, and dissipative in the path variable: 2 The reward data 3 and 4 are Borel, polynomial-growth, and Lipschitz in all arguments (Possamaï et al., 2021).
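The role of the Girsanov density can be checked numerically in a toy case. The sketch below assumes a scalar state, constant volatility, and the Markovian dissipative drift b(x) = -x; these choices and all names are illustrative assumptions, far simpler than the path- and law-dependent drift of the paper. Paths are simulated without drift under the reference measure, and expectations under the controlled measure are recovered by weighting with the stochastic exponential.

```python
import numpy as np

def girsanov_expectation(b, payoff, x0=1.0, sigma=1.0, T=1.0,
                         n_steps=200, n_paths=100_000, seed=0):
    """Estimate E[payoff(X_T)] under the drifted law by reweighting driftless paths."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0)
    log_z = np.zeros(n_paths)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        theta = b(x) / sigma                       # Girsanov integrand sigma^{-1} b
        log_z += theta * dw - 0.5 * theta ** 2 * dt
        x = x + sigma * dw                         # driftless reference dynamics
    z = np.exp(log_z)                              # density of controlled vs reference law
    return np.mean(z * payoff(x)), np.mean(z)

estimate, mean_density = girsanov_expectation(lambda x: -x, lambda x: x)
```

With this drift the controlled process is an Ornstein-Uhlenbeck process started at 1, so `estimate` should be close to exp(-1) and `mean_density` close to 1, up to Monte Carlo and time-discretization error.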
For a pair 5, the objective is
6
and a mean-field equilibrium is defined by optimality of 7 under 8 together with the consistency condition
9
The paper then introduces the Hamiltonian
00
with measurable selection 01, and proves that 02 is a mean-field equilibrium if and only if there exist processes 03 solving a generalized McKean-Vlasov BSDE, with
04
Moreover 05 (Possamaï et al., 2021).
This BSDE characterization is the central replacement for the Markovian HJB/master-equation route. Under the Lipschitz and dissipativity assumptions on 06, and either a small terminal payoff 07 or a smooth terminal payoff in the measure argument, the BSDE admits a unique solution 08 in
09
The proof combines change of measure, a fixed-point argument, contraction in a weighted norm, and the use of dissipativity to obtain bounds that are global in the time horizon (Possamaï et al., 2021).
A common misconception is that well-posedness and uniqueness in mean-field games necessarily require short time horizon, separability assumptions, or Lasry-Lions monotonicity. In this weak non-Markovian setting, the paper explicitly states that its existence and uniqueness results “do not require short time horizon, separability assumptions on the coefficients, nor Lasry and Lions's monotonicity conditions, but rather smallness, or alternatively regularity, conditions on the terminal reward and a dissipativity condition on the drift” (Possamaï et al., 2021).
5. Recursive utilities and non-Markovian portfolio games
The portfolio-game model with Epstein-Zin preferences provides a different non-Markovian mechanism, centered on recursive utility rather than on path-dependent state dynamics alone. On a filtered space carrying common noise 12 and idiosyncratic noise 13, the wealth dynamics of a representative agent are
14
with 15, bounded progressive coefficients, and 16 (Fu et al., 12 May 2025).
Given an externality 17, the Epstein-Zin utility satisfies the recursion
18
where
19
20
21
In the MFG limit, equilibrium requires the externality to coincide with the conditional log-average of optimal consumption and terminal wealth: 22 The paper establishes a uniqueness result by proving a one-to-one correspondence between Nash equilibria and the solutions to a class of BSDEs (Fu et al., 12 May 2025).
The core BSDE is
23
with the driver
24
The corresponding equilibrium controls are
25
A necessary stochastic maximum principle tailored to Epstein-Zin utility and a nonlinear transformation are the key ingredients in obtaining this characterization (Fu et al., 12 May 2025).
The stochastic maximum principle introduces adjoint processes 26 and 27, with Hamiltonian
28
and first-order conditions
29
The nonlinear transformation
30
then converts the adjoint system into the BSDE for equilibrium. According to the paper, this “log-ratio” transform is the key link between the SMP adjoints and the final BSDE characterization of the MFG Nash equilibrium (Fu et al., 12 May 2025).
In the deterministic case, where the coefficients depend only on time, the BSDE reduces to a deterministic ODE and then to a Riccati equation, yielding an explicit closed-form solution for the equilibrium investment and consumption policies (Fu et al., 12 May 2025).
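As a generic illustration of why a Riccati reduction gives closed forms, the sketch below integrates a scalar terminal-value Riccati equation backward in time and compares it with its explicit solution. The coefficients are hypothetical and are not those of the portfolio game: the toy problem is y'(t) = y(t)^2 - 1 with y(T) = 0, whose solution is y(t) = tanh(T - t).

```python
import math

def solve_riccati_backward(a, b, c, y_T, T, n_steps=1000):
    """Integrate y'(t) = a + b*y + c*y**2 backward from y(T) = y_T with RK4."""
    def f(y):
        return a + b * y + c * y * y
    h = -T / n_steps              # negative step: march from T down to 0
    y = y_T
    ys = [y]
    for _ in range(n_steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        ys.append(y)
    return ys[::-1]               # ys[i] approximates y(i * T / n_steps)

ys = solve_riccati_backward(a=-1.0, b=0.0, c=1.0, y_T=0.0, T=1.0)
closed_form_at_zero = math.tanh(1.0)
```

Here `ys[0]` agrees with `closed_form_at_zero` to high accuracy. In an applied setting the constants would be replaced by the model's time-dependent coefficients.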
6. Solution concepts, convergence, and analytical themes
Across the supplied works, non-Markovian MFGs are solved through several distinct but related analytical mechanisms.
| Framework | Equilibrium device | Principal well-posedness or limit statement |
|---|---|---|
| Time-fractional subdiffusion (Qing et al., 2018) | Variational formulation via convex duality | Existence of a classical solution; uniqueness under monotonicity |
| Past absorptions (Campi et al., 2019) | Fixed point in feedback or relaxed feedback form | Existence of relaxed and strict feedback solutions; approximate Nash equilibria with $\varepsilon_N \to 0$ |
| Weak path-dependent MFG (Possamaï et al., 2021) | Generalized McKean-Vlasov BSDE | Unique BSDE solution under dissipativity and smallness or smoothness |
| Epstein-Zin portfolio game (Fu et al., 12 May 2025) | BSDE plus stochastic maximum principle | One-to-one correspondence between Nash equilibria and a class of BSDEs; uniqueness result |
In the time-fractional model, the coupled HJB-FP system is the Euler-Lagrange system of two dual convex problems. The primal functional 35 is defined on 36 with 37, the dual functional 38 is defined on 39 satisfying the fractional-FP constraint, and Fenchel-Rockafellar duality yields
40
Taking first variations recovers precisely the HJB and FP equations with the correct fractional-time operators (Qing et al., 2018).
In the absorption model, existence proceeds by truncating drift and cost, applying the bounded-data existence result of Campi-Delarue via Kakutani’s theorem, passing to the limit using tightness in 41 and a martingale-problem characterization, and finally using a mimicking theorem of Brunick-Shreve to convert the limit into feedback form (Campi et al., 2019).
In the weak non-Markovian path-space formulation, the equilibrium BSDE is used both for well-posedness and for limit theory. The paper proves non-asymptotic convergence estimates for the 42-player game: 43 and
44
Here 45 measures the control-interaction correction and
46
In the state-dependent-law case, the paper states that one can insert classical rates such as 47, leading overall to error of order 48 in dimension 49 (Possamaï et al., 2021).
The same weak-form program also treats closed-loop equilibria and, in the Markovian case, the master equation. When a smooth master solution exists, the paper recovers
50
and establishes
51
which yields convergence of the finite-$N$ HJB system to the master equation (Possamaï et al., 2021).
A plausible implication of these results is that non-Markovianity shifts the analytic emphasis from PDE structure alone toward convex duality, weak formulations, stochastic maximum principles, and BSDE fixed points.
7. Conceptual distinctions, misconceptions, and research directions
One recurring distinction is between memory in dynamics and memory in interaction. In the time-fractional model, memory is generated at the microscopic level by a subdiffusion process, and the macroscopic equations inherit Caputo derivatives (Qing et al., 2018). In the absorption model, the state dynamics are standard diffusions but the mean-field interaction depends on survivor measures and the accumulated fraction absorbed, both of which are path-dependent (Campi et al., 2019). In the weak formulation and Epstein-Zin portfolio game, non-Markovianity is instead encoded through full path dependence of coefficients, law dependence on path space, or recursive utility characterized by BSDEs (Possamaï et al., 2021, Fu et al., 12 May 2025).
A second conceptual point concerns the role of monotonicity. In the time-fractional and absorption settings, uniqueness is obtained under monotonicity hypotheses of Lasry-Lions type (Qing et al., 2018, Campi et al., 2019). By contrast, the weak-form BSDE framework explicitly provides existence and uniqueness results that do not require Lasry-Lions monotonicity, replacing it with dissipativity of the drift and smallness or regularity assumptions on the terminal reward (Possamaï et al., 2021). This does not eliminate monotonicity from the subject; rather, it identifies an alternative route to well-posedness in non-Markovian environments.
A third point concerns the misconception that abandoning Markov structure necessarily makes equilibrium characterization intractable. The supplied papers provide four counterexamples. Time-fractional systems preserve a variational Euler-Lagrange structure (Qing et al., 2018). Absorption models admit strict and relaxed feedback fixed-point formulations and induce approximate Nash equilibria for large finite games (Campi et al., 2019). Weak path-dependent games are characterized by generalized McKean-Vlasov BSDEs and support non-asymptotic convergence rates (Possamaï et al., 2021). Recursive-utility portfolio games admit a one-to-one correspondence between Nash equilibria and BSDE solutions, and even explicit closed forms in the deterministic case (Fu et al., 12 May 2025).
The available research directions stated explicitly in the supplied material also point to broader generality. The Epstein-Zin study states that its combination of stochastic maximum principle, martingale-optimality principle, nonlinear adjoint-to-value transformation, and BSDE fixed-point arguments in BMO spaces can be exported to forward-utility MFGs, rank-dependent or habit-formation utilities, partial-information or jump-diffusion environments, and state-constraint or impulse-control MFGs (Fu et al., 12 May 2025). This suggests that non-Markovian mean-field games are less a narrow subclass than a collection of methodologies for equilibrium analysis beyond the classical dynamic programming paradigm.