
Sim-to-Real Transfer of Adaptive Control Parameters for AUV Stabilization under Current Disturbance (2310.11075v1)

Published 17 Oct 2023 in cs.RO, cs.AI, cs.SY, and eess.SY

Abstract: Learning-based adaptive control methods hold the promise of enabling autonomous agents to reduce the effect of process variations with minimal human intervention. However, their application to autonomous underwater vehicles (AUVs) has so far been restricted by 1) unknown dynamics, in the form of sea current disturbances that can be neither modelled properly nor measured due to limited sensor capability, and 2) the nonlinearity of AUV tasks, where the controller response at some operating points must be overly conservative in order to satisfy the specification at other operating points. Deep Reinforcement Learning (DRL) can alleviate these limitations by training general-purpose neural network policies, but applications of DRL algorithms to AUVs have been restricted to simulated environments due to their inherently high sample complexity and the distribution shift problem. This paper presents a novel approach that merges the Maximum Entropy Deep Reinforcement Learning framework with a classic model-based control architecture to formulate an adaptive controller. Within this framework, we introduce a Sim-to-Real transfer strategy comprising the following components: a bio-inspired experience replay mechanism, an enhanced domain randomisation technique, and an evaluation protocol executed on a physical platform. Our experimental assessments demonstrate that this method effectively learns proficient policies from suboptimal simulated models of the AUV, resulting in control performance three times higher when transferred to a real-world vehicle, compared to its model-based, nonadaptive but optimal counterpart.
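The abstract describes the architecture only at a high level. As a minimal sketch of the general pattern it names, combining a stochastic learned policy with a classical model-based inner loop, trained under domain randomisation, the Python snippet below adapts the gains of a PD controller on a 1-DoF AUV surge model with a randomized current disturbance. Everything here (the surge model, the parameter ranges, the linear policy standing in for a SAC network) is an illustrative assumption, not the paper's actual vehicle model, replay mechanism, or maximum-entropy implementation.

```python
# Hypothetical sketch: a stochastic policy adapts the PD gains of a 1-DoF AUV
# surge model under randomized current disturbance. Illustrative only; the
# paper's vehicle dynamics, SAC policy, and replay scheme are more elaborate.
import numpy as np

rng = np.random.default_rng(0)

def randomized_sim_params():
    """Domain randomization: resample plant and disturbance parameters each
    episode so the policy cannot overfit a single (suboptimal) sim model."""
    return {
        "mass": rng.uniform(8.0, 14.0),     # kg, incl. added mass (assumed range)
        "drag": rng.uniform(4.0, 12.0),     # linear drag coefficient (assumed)
        "current": rng.uniform(-0.4, 0.4),  # steady current velocity, m/s (assumed)
    }

def step(x, v, thrust, p, dt=0.05):
    """One Euler step of m * dv/dt = thrust - drag * (v - current)."""
    acc = (thrust - p["drag"] * (v - p["current"])) / p["mass"]
    v = v + acc * dt
    x = x + v * dt
    return x, v

def policy(obs, theta):
    """Stochastic policy over PD gains (a stand-in for a maximum-entropy SAC
    policy: it stays stochastic instead of collapsing to fixed gains)."""
    mean = theta @ obs  # linear feature map in place of a neural network
    kp = np.exp(mean[0] + 0.1 * rng.standard_normal())  # log-space keeps gains > 0
    kd = np.exp(mean[1] + 0.1 * rng.standard_normal())
    return kp, kd

def rollout(theta, x_ref=1.0, steps=200):
    """One episode: the policy tunes the gains, the PD loop does the control."""
    p = randomized_sim_params()
    x, v, cost = 0.0, 0.0, 0.0
    for _ in range(steps):
        obs = np.array([x_ref - x, -v, 1.0])  # tracking error, velocity, bias
        kp, kd = policy(obs, theta)
        thrust = kp * (x_ref - x) - kd * v    # classical model-based inner loop
        x, v = step(x, v, thrust, p)
        cost += (x_ref - x) ** 2
    return cost

theta = np.zeros((2, 3))  # policy parameters, to be optimized by an RL algorithm
print("episode cost under one random draw:", rollout(theta))
```

The division of labour shown here mirrors the abstract's claim: the classical loop guarantees a sensible control structure at every operating point, while the learned, stochastic gain schedule absorbs the unmeasured current disturbance that a single fixed tuning would have to be overly conservative against.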

