Dual Ensemble Kalman Filter for Stochastic Optimal Control (2404.06696v2)
Published 10 Apr 2024 in eess.SY and cs.SY
Abstract: In this paper, stochastic optimal control problems in continuous time and space are considered. In recent years, such problems have received renewed attention through the lens of reinforcement learning (RL), which is also one of our motivations. The main contribution is a simulation-based algorithm, the dual ensemble Kalman filter (EnKF), to numerically approximate the solution of these problems. The paper extends our previous work, where the dual EnKF was applied to deterministic settings of the problem. The theoretical results and algorithms are illustrated with numerical experiments.