
A Novel ML-driven Test Case Selection Approach for Enhancing the Performance of Grammatical Evolution (2312.14321v1)

Published 21 Dec 2023 in cs.NE

Abstract: Computational cost in metaheuristics such as Evolutionary Algorithms (EAs) is often a major concern, particularly with respect to their ability to scale. In data-based training, traditional EAs typically use a significant portion, if not all, of the dataset for model training and fitness evaluation in each generation. As a result, EAs incur high computational costs during the fitness evaluation of the population, particularly when working with large datasets. To mitigate this issue, we propose a Machine Learning (ML)-driven Distance-based Selection (DBS) algorithm that reduces the fitness evaluation time by optimizing test cases. We test our algorithm by applying it to 24 benchmark problems from the Symbolic Regression (SR) and digital circuit domains and then using Grammatical Evolution (GE) to train models on the reduced dataset. We use GE to test DBS on SR and produce a system flexible enough to test it further on digital circuit problems. The quality of the solutions is tested and compared against the conventional training method to measure the coverage of the training data selected using DBS, i.e., how well the subset matches the statistical properties of the entire dataset. Moreover, the effect of the optimized training data on run time and on the effective size of the evolved solutions is analyzed. Experimental and statistical evaluations of the results show that our method enabled GE to yield solutions superior or comparable to the baseline (using the full datasets), with smaller sizes, and that it demonstrates computational efficiency in terms of speed.
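
The abstract does not say how DBS measures distance or chooses test cases, so the following is only an illustrative sketch of the general idea of distance-based subset selection, not the authors' algorithm. It assumes a greedy farthest-point rule with Euclidean distance over each full row (inputs and target together), and it uses a made-up toy dataset. The helper names `farthest_point_subset` and `column_means` are hypothetical. Comparing column means of the subset and the full data is one rough way to look at the "coverage" the abstract mentions.

```python
import math
import random


def farthest_point_subset(rows, k, start=0):
    """Greedily pick k row indices; each new pick is the row farthest
    (Euclidean distance) from its nearest already-picked row.
    Illustrative stand-in only, not the paper's DBS procedure."""
    if not 0 < k <= len(rows):
        raise ValueError("k must be between 1 and len(rows)")
    chosen = [start]
    # Distance from every row to its nearest chosen row so far.
    nearest = [math.dist(r, rows[start]) for r in rows]
    while len(chosen) < k:
        nxt = max(range(len(rows)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, math.dist(r, rows[nxt]))
                   for d, r in zip(nearest, rows)]
    return chosen


def column_means(rows):
    """Mean of each column, used here as a crude coverage check."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Hypothetical toy data: inputs (x1, x2) with target y = x1^2 + x2.
rng = random.Random(0)
inputs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
data = [(x1, x2, x1 * x1 + x2) for x1, x2 in inputs]

# Keep 40 of the 200 cases as the reduced training set.
subset_idx = farthest_point_subset(data, k=40)
subset = [data[i] for i in subset_idx]

print("full means  :", column_means(data))
print("subset means:", column_means(subset))
```

In a GE run, fitness would then be evaluated on `subset` rather than `data`, which is where the time saving described in the abstract would come from. Farthest-point sampling favors spread over density, so a subset chosen this way can over-represent extreme cases; that trade-off is a property of this sketch and says nothing about DBS itself.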

