Demonstrating Reinforcement Learning and Run Time Assurance for Spacecraft Inspection Using Unmanned Aerial Vehicles (2405.06770v1)
Abstract: On-orbit spacecraft inspection is an important capability for enabling servicing and manufacturing missions and for extending spacecraft lifetimes. However, as space operations become increasingly common and complex, autonomous control methods are needed to reduce the burden on operators of individually monitoring each mission. For autonomous control methods to be used in space, they must exhibit safe behavior that is robust to real-world disturbances and uncertainty. In this paper, neural network controllers (NNCs) trained with reinforcement learning solve an inspection task, a foundational capability for servicing missions. Run time assurance (RTA) is used to assure safety of the NNC in real time, enforcing several constraints on position and velocity. The NNC and RTA are tested in the real world using unmanned aerial vehicles designed to emulate spacecraft dynamics. The results show that this emulation is a useful demonstration of the capability of the NNC and RTA, and that the algorithms are robust to real-world disturbances.
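The abstract describes RTA acting as a real-time safety filter between the learned NNC and the plant. A minimal sketch of one common RTA pattern, simplex-style switching, is shown below. All dynamics, limits, gains, and function names here are illustrative assumptions, not the paper's actual formulation: it uses simple double-integrator translational dynamics (ignoring the orbital terms of the Clohessy-Wiltshire equations), hypothetical position/velocity limits, and a hypothetical braking backup controller.

```python
import numpy as np

# Hypothetical safety limits and control period (not the paper's values).
POS_LIMIT = 10.0   # max allowed distance from the chief (m)
VEL_LIMIT = 0.5    # max allowed speed (m/s)
DT = 1.0           # control period (s)

def backup_control(state):
    """Backup controller: brake by thrusting opposite the velocity."""
    vel = state[3:]
    return -0.1 * vel  # proportional braking, hypothetical gain

def violates_constraints(state):
    """Check the position and velocity magnitude constraints."""
    pos, vel = state[:3], state[3:]
    return np.linalg.norm(pos) > POS_LIMIT or np.linalg.norm(vel) > VEL_LIMIT

def predict(state, u, dt=DT):
    """One-step double-integrator propagation of [pos, vel] under thrust u."""
    pos, vel = state[:3], state[3:]
    new_vel = vel + u * dt
    new_pos = pos + vel * dt + 0.5 * u * dt**2
    return np.concatenate([new_pos, new_vel])

def rta_filter(state, u_nnc):
    """Simplex-style RTA: pass the NNC action through if the predicted
    next state is safe; otherwise substitute the backup action."""
    if violates_constraints(predict(state, u_nnc)):
        return backup_control(state)
    return u_nnc
```

In this sketch, the NNC's action is only overridden when a one-step prediction leaves the safe set; more sophisticated RTA designs (e.g., control-barrier-function quadratic programs, as in several of the works cited by the paper) instead minimally modify the action rather than switching wholesale.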