Effect of Optimizer, Initializer, and Architecture of Hypernetworks on Continual Learning from Demonstration (2401.00524v1)
Abstract: In continual learning from demonstration (CLfD), a robot learns a sequence of real-world motion skills continually from human demonstrations. Recently, hypernetworks have been successful at solving this problem. In this paper, we perform an exploratory study of the effects of different optimizers, initializers, and network architectures on the continual learning performance of hypernetworks for CLfD. Our results show that adaptive learning-rate optimizers work well, but that initializers specially designed for hypernetworks offer no advantage for CLfD. We also show that hypernetworks capable of stable trajectory predictions are robust to changes in network architecture. Our open-source code is available at https://github.com/sebastianbergner/ExploringCLFD.
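As a concrete illustration of the setup the abstract describes, below is a minimal PyTorch-style sketch of a task-conditioned hypernetwork for continual learning, following the output-regularization idea of von Oswald et al. (ICLR 2020). This is a sketch under stated assumptions, not the paper's implementation: the names `TaskHypernetwork`, `continual_loss`, and `old_outputs` are hypothetical, and the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class TaskHypernetwork(nn.Module):
    """Maps a learned per-task embedding to the flattened weights of a
    small target network (e.g., a trajectory-predicting dynamics model)."""
    def __init__(self, num_tasks: int, emb_dim: int, target_param_count: int):
        super().__init__()
        # One learnable embedding per demonstrated skill.
        self.task_emb = nn.Embedding(num_tasks, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(emb_dim, 256),
            nn.ReLU(),
            nn.Linear(256, target_param_count),
        )

    def forward(self, task_id: torch.Tensor) -> torch.Tensor:
        # Returns the generated target-network parameters for this task.
        return self.net(self.task_emb(task_id))

def continual_loss(hnet, task_loss, old_outputs, beta=0.005):
    """Current-task loss plus the hypernetwork output regularizer:
    weights generated for previous tasks are pulled toward snapshots
    stored before training on the current task began."""
    reg = torch.tensor(0.0)
    for prev_id, snapshot in old_outputs.items():
        w = hnet(torch.tensor([prev_id]))
        reg = reg + ((w - snapshot) ** 2).mean()
    return task_loss + beta * reg
```

A study like the one in the abstract would then train a model of this kind while varying the optimizer (e.g., `torch.optim.Adam(hnet.parameters(), lr=1e-4)` versus Adagrad or RMSProp) and the initialization of `hnet`'s linear layers (e.g., Kaiming, Xavier, or the hypernetwork-specific schemes of Chang et al.), keeping the CLfD benchmark fixed.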