- The paper develops a rigorous framework to approximate transfer operators for mean-field SDEs using a decoupling strategy and Galerkin projection.
- It adapts the EDMD algorithm with Monte Carlo estimation and proves almost sure convergence of the estimated matrices when the data come from decoupled simulations, which underpins the subsequent spectral analysis.
- Numerical experiments on Cormier and Kuramoto models validate the method’s effectiveness in identifying invariant measures and metastable dynamics.
Data-driven Approximation of Transfer Operators for Mean-field Stochastic Differential Equations
Introduction and Motivation
This paper develops a rigorous framework for the data-driven approximation of transfer operators—specifically, the Perron–Frobenius and Koopman operators—in the context of mean-field stochastic differential equations (SDEs), also known as McKean–Vlasov equations. These equations arise as the mean-field limit of interacting particle systems with symmetric interactions and are central to modeling phenomena in physics, biology, economics, and social sciences. The global dynamical properties of such systems, including invariant distributions, metastable sets, and transition rates, are encoded in the spectral properties of associated transfer operators. However, the nonlinearity inherent in McKean–Vlasov dynamics complicates the direct application of classical transfer operator theory and data-driven methods such as Extended Dynamic Mode Decomposition (EDMD).
Decoupled McKean–Vlasov SDEs and Transfer Operator Construction
A key technical innovation is the use of decoupled McKean–Vlasov SDEs, in which the law μt is treated as an external parameter rather than being coupled to the process. This decoupling restores the Markov family property and linearity, enabling the definition of well-posed transfer operators. The paper formalizes the construction of the Koopman and Perron–Frobenius operators for these decoupled SDEs, providing explicit integral representations in terms of the transition kernel p(μ0,0,t,x,y).
The infinitesimal generators of these operators are derived, corresponding to the backward Kolmogorov and Fokker–Planck equations, respectively, with coefficients parameterized by the time-dependent law μt. This approach circumvents the nonlinearity and ill-posedness issues that arise when attempting to define transfer operators for the original McKean–Vlasov SDEs.
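For orientation, the operators and generators described above can be written out as follows. This is a sketch under assumed notation: the drift b, the diffusion coefficient σ, and a = σσ^T are symbols introduced here for illustration and are not fixed by this summary; only the transition kernel p(μ0,0,t,x,y) is taken from the text.

```latex
\begin{align*}
(\mathcal{K}^{t} f)(x) &= \mathbb{E}\bigl[f(X_t) \mid X_0 = x\bigr]
  = \int p(\mu_0, 0, t, x, y)\, f(y)\, \mathrm{d}y, \\
(\mathcal{P}^{t} \rho)(y) &= \int p(\mu_0, 0, t, x, y)\, \rho(x)\, \mathrm{d}x, \\
\mathcal{L}_t f &= b(\cdot, \mu_t) \cdot \nabla f
  + \tfrac{1}{2}\, a(\cdot, \mu_t) : \nabla^2 f, \\
\mathcal{L}_t^{*} \rho &= -\nabla \cdot \bigl(b(\cdot, \mu_t)\, \rho\bigr)
  + \tfrac{1}{2}\, \nabla^2 : \bigl(a(\cdot, \mu_t)\, \rho\bigr).
\end{align*}
```

Because μt enters only as a given, time-dependent coefficient, both generators are linear in f and ρ, which is the point of the decoupling.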
Galerkin Projection and Data-driven Estimation
To enable numerical computation, the infinite-dimensional transfer operators are projected onto finite-dimensional subspaces via Galerkin projection, using a dictionary of basis functions {ψ1,…,ψN}. The projected operators are represented by matrices whose entries are expectations involving the basis functions and the transition kernel. Since these expectations are generally intractable, the paper employs Monte Carlo estimation using simulated trajectories of the decoupled SDE.
The EDMD algorithm is adapted to this setting, with the Gram and structure matrices estimated from data. Theoretical analysis establishes almost sure and L2 convergence of the data-driven matrices to their exact counterparts as the number of samples M→∞ and the time discretization h→0, under suitable regularity and independence assumptions. The convergence proofs leverage the strong law of large numbers and stability estimates for the numerical schemes.
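The estimation of the Gram and structure matrices can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the function names, the monomial dictionary, and the toy data, which come from a one-dimensional Ornstein–Uhlenbeck process (dX = -X dt + dW, sampled with its exact Gaussian transition) instead of a mean-field model.

```python
import numpy as np

def monomial_dictionary(x, n_basis):
    """Evaluate psi_k(x) = x**k for k = 0..n_basis-1 (hypothetical dictionary)."""
    return np.vander(x, N=n_basis, increasing=True)  # shape (M, n_basis)

def edmd_koopman_matrix(x, y, n_basis):
    """Monte Carlo estimate of the projected Koopman matrix from pairs (x_m, y_m)."""
    psi_x = monomial_dictionary(x, n_basis)
    psi_y = monomial_dictionary(y, n_basis)
    m = x.shape[0]
    gram = psi_x.T @ psi_x / m      # G_ij ~ E[psi_i(X_0) psi_j(X_0)]
    struct = psi_x.T @ psi_y / m    # A_ij ~ E[psi_i(X_0) psi_j(X_t)]
    return np.linalg.pinv(gram) @ struct

# Toy data (assumption): OU process with exact transition over lag time tau.
rng = np.random.default_rng(0)
m, tau = 200_000, 0.5
x0 = rng.normal(0.0, 1.0, size=m)
decay = np.exp(-tau)
noise_std = np.sqrt((1.0 - decay**2) / 2.0)
xt = decay * x0 + noise_std * rng.normal(size=m)

k_matrix = edmd_koopman_matrix(x0, xt, n_basis=3)
# Sort eigenvalues in descending order; for this toy process they should
# approach exp(-k * tau), k = 0, 1, 2, as the sample size M grows.
eigs = np.sort(np.real(np.linalg.eigvals(k_matrix)))[::-1]
```

In the mean-field setting, `x0` and `xt` would instead hold the initial and final states of independent decoupled-SDE trajectories; the matrix assembly itself is unchanged.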
A notable result is that almost sure convergence is only guaranteed when using data from decoupled SDE simulations; when using data from the original interacting particle system, only L2 convergence is obtained due to the lack of independence among trajectories.
Numerical Schemes and Error Analysis
The paper provides a detailed analysis of the numerical schemes for simulating decoupled McKean–Vlasov SDEs. The Euler–Maruyama method is employed, with the law μt estimated from simulations of the interacting particle system. Error bounds are derived for the approximation of μt in terms of the Wasserstein distance, with explicit rates depending on the dimension d and the number of particles M̃. Propagation-of-chaos results and mean-square convergence theorems are invoked to justify the accuracy of the empirical law approximation.
Theoretical results guarantee that the error in the data-driven transfer operator estimates can be controlled by increasing the number of particles and decreasing the time step, with explicit bounds provided.
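The two-stage scheme described in this section (first approximate the law with an interacting particle system, then simulate the decoupled SDE with that law frozen as an input) might look roughly like the sketch below. The drift b(x, μ) = -(x - E_μ[X]), the constant diffusion `sigma`, and all function names are hypothetical choices made for illustration; in particular, the law enters only through its mean here, which is a simplification and not one of the paper's benchmark models.

```python
import numpy as np

def drift(x, mean):
    """Hypothetical mean-field drift b(x, mu) = -(x - E_mu[X])."""
    return -(x - mean)

def simulate_particle_means(x0, h, n_steps, sigma, rng):
    """Euler-Maruyama for the interacting particle system. Returns the
    empirical mean at every time step as a stand-in for the law mu_t."""
    x = x0.copy()
    means = np.empty(n_steps + 1)
    means[0] = x.mean()
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(h), size=x.shape)
        x = x + drift(x, x.mean()) * h + sigma * dw
        means[k + 1] = x.mean()
    return means

def simulate_decoupled(x0, means, h, sigma, rng):
    """Euler-Maruyama for the decoupled SDE: the stored means act as an
    external time-dependent parameter, so trajectories do not interact."""
    x = x0.copy()
    for k in range(len(means) - 1):
        dw = rng.normal(0.0, np.sqrt(h), size=x.shape)
        x = x + drift(x, means[k]) * h + sigma * dw
    return x

rng = np.random.default_rng(1)
h, n_steps, sigma = 0.01, 100, 0.5
# Stage 1: particles used only to approximate the law.
particles0 = rng.normal(1.0, 1.0, size=5_000)
means = simulate_particle_means(particles0, h, n_steps, sigma, rng)
# Stage 2: independent decoupled trajectories from arbitrary initial points.
x0 = rng.uniform(-2.0, 3.0, size=10_000)
xt = simulate_decoupled(x0, means, h, sigma, rng)
```

The pairs `(x0, xt)` are the kind of independent samples that the convergence statements for the decoupled setting rely on. A law that enters the drift through more than its mean would require storing the full particle cloud at each step instead of a single scalar.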
Spectral Analysis and Benchmark Applications
The methodology is validated on several benchmark models:
- Cormier Model: The approach correctly identifies multiple invariant distributions and their stability, as well as the expected Hermite polynomial structure of Koopman eigenfunctions and the exponential decay of eigenvalues.
- Kuramoto Model on the Circle: The method recovers the unique invariant distribution and reveals metastable sets corresponding to angular regions, with eigenvalues and eigenfunctions matching theoretical predictions.
- Kuramoto Model on the Sphere: The framework is extended to high-dimensional, constrained dynamics, demonstrating the identification of metastable hemispheres and concentration of invariant measures at the poles.
In all cases, the EDMD-based spectral analysis yields eigenvalues and eigenfunctions that accurately reflect the slow dynamics and metastable structures of the underlying mean-field systems. The numerical results confirm the theoretical convergence and demonstrate the versatility of the approach.
Implications and Future Directions
This work provides a principled foundation for the data-driven analysis of mean-field stochastic systems via transfer operator methods. The decoupling strategy and convergence analysis address longstanding challenges in the application of operator-theoretic techniques to nonlinear, measure-dependent dynamics. The framework enables the identification of global dynamical features from simulation data, with rigorous error control.
Potential extensions include the use of kernel methods and deep learning architectures for basis function selection, the application of generator-based EDMD (gEDMD), and the analysis of clustering and transition rates in high-dimensional systems. Further investigation into the impact of dictionary size and time discretization on spectral accuracy is warranted. The methodology is broadly applicable to the study of metastability, rare events, and coarse-grained dynamics in complex stochastic systems.
Conclusion
The paper establishes a robust, theoretically justified approach for the data-driven approximation of transfer operators in mean-field SDEs, leveraging decoupled dynamics, Galerkin projection, and EDMD. The convergence results and numerical experiments demonstrate the efficacy of the method in extracting global dynamical information from simulation data. This work lays the groundwork for future research in operator-theoretic analysis of stochastic systems with mean-field interactions, with significant implications for modeling, simulation, and understanding of complex phenomena across scientific domains.