
Control-Theoretic Techniques for Online Adaptation of Deep Neural Networks in Dynamical Systems (2402.00761v1)

Published 1 Feb 2024 in cs.LG, cs.NE, cs.RO, cs.SY, and eess.SY

Abstract: Deep neural networks (DNNs), trained with gradient-based optimization and backpropagation, are currently the primary tool in modern artificial intelligence, machine learning, and data science. In many applications, DNNs are trained offline, through supervised learning or reinforcement learning, and deployed online for inference. However, training DNNs with standard backpropagation and gradient-based optimization gives no intrinsic performance guarantees or bounds on the DNN, which is essential for applications such as controls. Additionally, many offline-training and online-inference problems, such as sim2real transfer of reinforcement learning policies, experience domain shift from the training distribution to the real-world distribution. To address these stability and transfer learning issues, we propose using techniques from control theory to update DNN parameters online. We formulate the fully-connected feedforward DNN as a continuous-time dynamical system, and we propose novel last-layer update laws that guarantee desirable error convergence under various conditions on the time derivative of the DNN input vector. We further show that training the DNN under spectral normalization controls the upper bound of the error trajectories of the online DNN predictions, which is desirable when numerically differentiated quantities or noisy state measurements are input to the DNN. The proposed online DNN adaptation laws are validated in simulation to learn the dynamics of the Van der Pol system under domain shift, where parameters are varied in inference from the training dataset. The simulations demonstrate the effectiveness of using control-theoretic techniques to derive performance improvements and guarantees in DNN-based learning systems.
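The abstract's central idea, freezing an offline-trained network and adapting only its last-layer weights online with a continuous-time update law, can be sketched as follows. This is a hedged illustration, not the paper's specific design: the random-feature hidden layer stands in for an offline-trained DNN, and the gradient-flow law `W_dot = -gamma * e * phi(x)` (integrated with forward Euler), the gain `gamma`, and the sinusoidal input trajectory are all stand-in assumptions. The paper's spectrally normalized training and its convergence-guaranteed update laws are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "offline-trained" feature map: one tanh hidden layer whose
# weights are fixed at deployment time (random features stand in for
# a pretrained network here).
W1 = rng.standard_normal((16, 1))
b1 = rng.standard_normal(16)

def phi(x):
    """Hidden-layer features of the frozen network."""
    return np.tanh(W1 @ np.atleast_1d(x) + b1)

# Only the last-layer weights W adapt online. The continuous-time law
# W_dot = -gamma * e * phi(x) is a generic gradient-flow update
# (an assumption, not the paper's law), discretized with forward Euler.
W = np.zeros(16)        # last-layer weights
gamma, dt = 5.0, 1e-3   # adaptation gain and integration step

errs = []
for k in range(20000):
    t = k * dt
    x = np.sin(t)                    # DNN input trajectory
    y_true = 2.0 * x + 0.5           # "shifted" target map at deployment
    e = W @ phi(x) - y_true          # online prediction error
    W = W - dt * gamma * e * phi(x)  # Euler step of the update law
    errs.append(abs(e))

print(f"initial |e| = {errs[0]:.3f}, "
      f"final mean |e| = {np.mean(errs[-100:]):.3f}")
```

Adapting only the last layer makes the prediction linear in the adapted parameters, which is what makes Lyapunov-style convergence analysis of the error dynamics tractable; the paper's spectral normalization additionally bounds the frozen layers' Lipschitz constant, which caps how input noise propagates into the error trajectories.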

