Reinforcement Learning for Online Testing of Autonomous Driving Systems: a Replication and Extension Study (2403.13729v1)
Abstract: A recent study showed that Reinforcement Learning (RL), used in combination with many-objective search, outperforms alternative techniques (random search and many-objective search) for online testing of Deep Neural Network-enabled systems. The empirical evaluation of these techniques was conducted on a state-of-the-art Autonomous Driving System (ADS). This work is a replication and extension of that empirical study. Our replication shows that RL does not outperform pure random test generation when the comparison is conducted under the same settings as the original study, but without the confounding factor introduced by the way collisions are measured. Our extension aims to eliminate some of the possible reasons for the poor performance of RL observed in our replication: (1) the presence of reward components that provide contrasting or useless feedback to the RL agent; (2) the use of an RL algorithm (Q-learning) that requires discretization of an intrinsically continuous state space. Results show that our new RL agent is able to converge to an effective policy that outperforms random testing. Results also highlight other possible improvements, which open the way to further investigation of how best to leverage RL for online ADS testing.
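To make point (2) concrete, the following is a minimal sketch of why tabular Q-learning needs a discretized state. It is not the code of either study: the two state features, their bounds, the bin counts, and the helper names `discretize` and `q_update` are all assumptions chosen for illustration.

```python
from collections import defaultdict

def discretize(state, lows, highs, bins):
    """Map a continuous state vector to a tuple of bin indices (a table key)."""
    cell = []
    for x, lo, hi, n in zip(state, lows, highs, bins):
        idx = int((x - lo) / (hi - lo) * n)
        cell.append(min(max(idx, 0), n - 1))  # clamp out-of-range values
    return tuple(cell)

def q_update(q, s, a, reward, s_next, n_actions, alpha=0.1, gamma=0.9):
    """One standard Q-learning update on a table keyed by (cell, action)."""
    best_next = max(q[(s_next, b)] for b in range(n_actions))
    q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])

# Hypothetical 2-feature state, e.g. (distance in m, speed in m/s).
lows, highs, bins = (0.0, 0.0), (100.0, 30.0), (10, 5)
q = defaultdict(float)

s = discretize((42.7, 12.3), lows, highs, bins)
s_near = discretize((49.9, 17.9), lows, highs, bins)  # a different situation
s_next = discretize((55.0, 20.0), lows, highs, bins)
q_update(q, s, 0, 1.0, s_next, n_actions=3)
```

In this sketch `s` and `s_near` fall into the same cell, so the table cannot tell them apart even though the underlying continuous states differ. Finer bins reduce that loss but enlarge the table, which is the trade-off that an algorithm operating directly on continuous states avoids.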
- Luca Giamattei
- Matteo Biagiola
- Roberto Pietrantuono
- Stefano Russo
- Paolo Tonella