
On the continuity and smoothness of the value function in reinforcement learning and optimal control (2403.14432v1)

Published 21 Mar 2024 in eess.SY, cs.AI, and cs.SY

Abstract: The value function plays a crucial role as a measure of the cumulative future reward an agent receives in both reinforcement learning and optimal control. It is therefore of interest to study how similar the values of neighboring states are, i.e., to investigate the continuity of the value function. We do so by providing and verifying upper bounds on the value function's modulus of continuity. Additionally, we show that the value function is always Hölder continuous under relatively weak assumptions on the underlying system and that non-differentiable value functions can be made differentiable by slightly "disturbing" the system.
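
For readers unfamiliar with the two notions the abstract relies on, the standard textbook definitions can be written as below. This is a reference sketch only: the symbols V (value function), S (state space), d (a metric on S), C, and α are notation assumed here and may differ from the paper's own, and the paper's precise assumptions are not reproduced.

```latex
% Assumed notation (not taken from the paper): V : S -> R is the value
% function and d is a metric on the state space S.

% Modulus of continuity of V: the largest change in value between
% two states that are at most \delta apart.
\omega_V(\delta) \;=\; \sup\bigl\{\, |V(s) - V(s')| \;:\; s, s' \in S,\ d(s, s') \le \delta \,\bigr\}

% V is Hoelder continuous with exponent \alpha \in (0, 1] if, for some C \ge 0,
|V(s) - V(s')| \;\le\; C \, d(s, s')^{\alpha} \qquad \text{for all } s, s' \in S,

% which is the same as the bound
\omega_V(\delta) \;\le\; C \, \delta^{\alpha} \qquad \text{for all } \delta \ge 0.
```

The case α = 1 is Lipschitz continuity, so an upper bound on the modulus of continuity of the form Cδ^α directly quantifies how similar the values of neighboring states must be.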

