Evaluating Search-Based Software Microbenchmark Prioritization (2211.13525v4)
Abstract: Ensuring that software performance does not degrade after a code change is paramount. One solution is to regularly execute software microbenchmarks, a performance-testing technique similar to (functional) unit tests, which, however, often becomes infeasible due to extensive runtimes. To address this challenge, research has investigated regression testing techniques such as test case prioritization (TCP), which reorder the execution within a microbenchmark suite to detect larger performance changes sooner. Such techniques are either designed for unit tests, performing poorly on microbenchmarks, or require complex performance models, drastically limiting their applicability. In this paper, we empirically evaluate single- and multi-objective search-based microbenchmark prioritization techniques to understand whether they are more effective and efficient than greedy, coverage-based techniques. To this end, we devise three search objectives: coverage, to maximize; coverage overlap, to minimize; and historical performance change detection, to maximize. We find that search algorithms (SAs) are only competitive with, but do not outperform, the best greedy, coverage-based baselines. However, a simple greedy technique utilizing solely the performance change history (without coverage information) is equally or more effective than the best coverage-based techniques while being considerably more efficient, with a runtime overhead below 1%. These results show that simple, non-coverage-based techniques are a better fit for microbenchmarks than complex coverage-based techniques.
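To make the history-based idea concrete, the sketch below shows one way a greedy, non-coverage prioritization could look: benchmarks are ordered by the magnitude of performance changes they detected in past runs, so historically "change-prone" benchmarks execute first. This is a minimal illustration under assumptions, not the paper's implementation: the class and method names (`HistoryBasedPrioritizer`, `score`) are invented for this example, and scoring by the largest absolute past change is only one plausible ranking function; the abstract does not specify the exact heuristic.

```java
import java.util.Comparator;
import java.util.List;

/**
 * Minimal sketch of a greedy, history-based microbenchmark prioritization.
 * Assumption: each benchmark carries the percentage performance changes it
 * detected in previous runs; the paper's exact ranking function is not given
 * in the abstract, so this uses the largest absolute past change as the score.
 */
public class HistoryBasedPrioritizer {

    record Benchmark(String name, List<Double> pastChangePercent) {
        // Score a benchmark by the largest performance change it has detected so far.
        double score() {
            return pastChangePercent.stream()
                    .mapToDouble(Math::abs)
                    .max()
                    .orElse(0.0); // no history yet -> lowest priority
        }
    }

    /** Orders the suite so benchmarks with larger historical changes run first. */
    static List<Benchmark> prioritize(List<Benchmark> suite) {
        return suite.stream()
                .sorted(Comparator.comparingDouble(Benchmark::score).reversed())
                .toList();
    }

    public static void main(String[] args) {
        List<Benchmark> suite = List.of(
                new Benchmark("parseJson", List.of(0.5, 12.3)),
                new Benchmark("serialize", List.of(1.1)),
                new Benchmark("hashLookup", List.of()));
        prioritize(suite).forEach(b -> System.out.println(b.name()));
    }
}
```

Because such a ranking needs only previously collected measurement results and no coverage instrumentation, its overhead is limited to sorting the suite, which is consistent with the sub-1% runtime overhead reported in the abstract.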