
Dual Ensemble Kalman Filter for Stochastic Optimal Control (2404.06696v2)

Published 10 Apr 2024 in eess.SY and cs.SY

Abstract: This paper considers stochastic optimal control problems in continuous time and space. In recent years, such problems have received renewed attention through the lens of reinforcement learning (RL), which is also one of our motivations. The main contribution is a simulation-based algorithm, the dual ensemble Kalman filter (EnKF), to numerically approximate the solution of these problems. The paper extends our previous work, in which the dual EnKF was applied in deterministic settings of the problem. The theoretical results and algorithms are illustrated with numerical experiments.

