
On The Nature Of The Phenotype In Tree Genetic Programming (2402.08011v1)

Published 12 Feb 2024 in cs.NE

Abstract: In this contribution, we discuss the basic concepts of genotypes and phenotypes in tree-based GP (TGP), and then analyze their behavior using five benchmark datasets. We show that TGP exhibits the same behavior that we can observe in other GP representations: at the genotypic level, trees frequently show unchecked growth with seemingly ineffective code, but at the phenotypic level, much smaller trees can be observed. To generate phenotypes, we provide a unique technique for removing semantically ineffective code from GP trees. The approach extracts considerably simpler phenotypes while not being limited to local operations in the genotype. We generalize this transformation based on a problem-independent parameter that enables a further simplification of the exact phenotype by coarse-graining to produce approximate phenotypes. The concept of these phenotypes (exact and approximate) allows us to clarify what evolved solutions truly predict, making GP models considered at the phenotypic level much more interpretable.
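
To make the idea of removing semantically ineffective code concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a hypothetical nested-tuple encoding of trees, a three-operator function set, and an absolute tolerance `eps` that merely stands in for the coarse-graining parameter mentioned in the abstract. Each subtree is replaced by a smaller candidate (one of its own descendants, or a constant) whenever the two produce the same outputs on the data, exactly for `eps=0` and approximately for `eps>0`.

```python
import operator

# Hypothetical tree encoding (an assumption of this sketch, not from the paper):
#   ("add" | "sub" | "mul", left, right), ("x", feature_index), ("const", value)
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, X):
    """Return the tree's output for every row of X (its semantics on X)."""
    kind = tree[0]
    if kind == "x":
        return [row[tree[1]] for row in X]
    if kind == "const":
        return [float(tree[1])] * len(X)
    left, right = evaluate(tree[1], X), evaluate(tree[2], X)
    return [OPS[kind](a, b) for a, b in zip(left, right)]


def size(tree):
    """Number of nodes in the tree."""
    if tree[0] not in OPS:
        return 1
    return 1 + size(tree[1]) + size(tree[2])


def subtrees(tree):
    """Yield the tree itself and all of its descendants."""
    yield tree
    if tree[0] in OPS:
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])


def max_abs_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))


def simplify(tree, X, eps=0.0):
    """Bottom-up: swap each subtree for a smaller, semantically matching candidate."""
    if tree[0] in OPS:
        tree = (tree[0], simplify(tree[1], X, eps), simplify(tree[2], X, eps))
    target = evaluate(tree, X)
    # Candidates: every descendant, plus a constant at the mean output.
    candidates = list(subtrees(tree))
    candidates.append(("const", sum(target) / len(target)))
    best = tree
    for cand in candidates:
        if size(cand) < size(best) and max_abs_diff(evaluate(cand, X), target) <= eps:
            best = cand
    return best


# Toy data: 10 rows, 2 features.
X = [[0.1 * i, 1.0 - 0.05 * i] for i in range(10)]

# x0 + 0 * (x1 - x0): the multiplication by zero is ineffective code.
bloated = ("add", ("x", 0), ("mul", ("const", 0.0), ("sub", ("x", 1), ("x", 0))))
exact = simplify(bloated, X)

# x0 + 0.001 * x1: only removable when a tolerance is allowed.
noisy = ("add", ("x", 0), ("mul", ("const", 0.001), ("x", 1)))
approx = simplify(noisy, X, eps=0.01)
```

In this sketch the match is judged only on the rows of `X`, so the simplified tree is equivalent to the original on that data rather than provably equivalent everywhere. With `eps > 0` the tolerance is applied at every level, so deviations from the original tree can accumulate in deeper trees.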
