SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning (2203.05566v2)
Abstract: Testing video games is an increasingly difficult task, as traditional methods fail to scale with growing software systems. Manual testing is a labor-intensive process and therefore quickly becomes cost-prohibitive. Scripted automated testing is affordable, but scripts are ineffective in non-deterministic environments, and knowing when to run each test is a problem in itself. The complexity, scope, and player expectations of modern games are rapidly increasing, and quality control now accounts for a large portion of production cost and delivery risk. Reducing this risk and keeping production on track is a major challenge for the industry. To keep production costs realistic up to and after release, we focus on preventive quality assurance tactics alongside testing and data analysis automation. We present SUPERNOVA (Selection of tests and Universal defect Prevention in External Repositories for Novel Objective Verification of software Anomalies), a system responsible for test selection and defect prevention that also functions as an automation hub. By integrating data analysis functionality with machine and deep learning capabilities, SUPERNOVA assists quality assurance testers in finding bugs and developers in reducing defects, which improves stability during the production cycle and keeps testing costs under control. The direct impact has been a reduction of 55% or more in testing hours for an undisclosed, shipped sports game title that used these test selection optimizations. Furthermore, using risk scores generated by a semi-supervised machine learning model, we are able to predict with 71% precision and 77% recall whether a change-list is bug inducing, and to provide developers a detailed breakdown of this inference. These efforts improve workflow and reduce the testing hours required on game titles in development.
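The risk-based test selection the abstract describes can be illustrated with a minimal sketch: a change-list is mapped to a bug-inducing probability, and only the tests covering high-risk change-lists are scheduled. The feature names, weights, and the logistic form below are illustrative placeholders, not the paper's actual semi-supervised model.

```python
import math

# Illustrative feature weights and bias; the paper's real model is a
# semi-supervised ML model, so these numbers are purely hypothetical.
WEIGHTS = {"lines_changed": 0.004, "files_touched": 0.15, "author_recent_bugs": 0.5}
BIAS = -2.0

def risk_score(features):
    """Logistic score in [0, 1]: probability the change-list is bug inducing."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def select_tests(change_lists, test_map, threshold=0.5):
    """Schedule only the tests mapped to change-lists whose risk exceeds threshold."""
    selected = set()
    for cl_id, features in change_lists.items():
        if risk_score(features) >= threshold:
            selected.update(test_map.get(cl_id, []))
    return sorted(selected)

# Hypothetical change-lists: CL-1 is large and risky, CL-2 is a small tweak.
change_lists = {
    "CL-1": {"lines_changed": 900, "files_touched": 12, "author_recent_bugs": 3},
    "CL-2": {"lines_changed": 10, "files_touched": 1, "author_recent_bugs": 0},
}
test_map = {"CL-1": ["physics_smoke", "ui_regression"], "CL-2": ["ui_regression"]}
print(select_tests(change_lists, test_map))  # only CL-1's tests are selected
```

Thresholding the score trades testing hours against missed defects: raising the threshold skips more tests (cheaper, riskier), which is the lever behind the reported reduction in testing hours.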