
A Transformer Model for Symbolic Regression towards Scientific Discovery (2312.04070v2)

Published 7 Dec 2023 in cs.LG

Abstract: Symbolic Regression (SR) searches for mathematical expressions that best describe numerical datasets. This makes it possible to circumvent the interpretation issues inherent to artificial neural networks, but SR algorithms are often computationally expensive. This work proposes a new Transformer model for Symbolic Regression, with a particular focus on its application to Scientific Discovery. We propose three encoder architectures of increasing flexibility, at the cost of violating column-permutation equivariance. Training results indicate that the most flexible architecture is required to prevent overfitting. Once trained, we apply our best model to the SRSD datasets (Symbolic Regression for Scientific Discovery datasets), where it yields state-of-the-art results under the normalized tree-based edit distance, at no extra computational cost.
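
The metric named in the abstract, the normalized tree-based edit distance, compares the syntax tree of a predicted expression against that of the ground-truth expression. The sketch below is a minimal illustration of the idea rather than the authors' implementation: it assumes the SymPy library for expression parsing and the third-party zss package for the Zhang-Shasha tree edit distance, and it normalizes by the node count of the ground-truth tree (capped at 1), a common convention that may differ in detail from the exact SRSD definition.

```python
# Minimal sketch: normalized tree-based edit distance between a predicted
# expression and a ground-truth expression. Assumes the `sympy` and `zss`
# (Zhang-Shasha tree edit distance) packages; the normalization choice below
# is illustrative, not necessarily the SRSD benchmark's exact definition.
import sympy
from zss import Node, simple_distance

def to_tree(expr):
    """Convert a SymPy expression into a zss.Node tree.

    Internal nodes are labeled with the operator name (Add, Mul, sin, ...),
    leaves with the symbol or constant they represent.
    """
    if not expr.args:                        # leaf: Symbol, Integer, Rational, ...
        return Node(str(expr))
    node = Node(expr.func.__name__)
    for arg in expr.args:
        node.addkid(to_tree(arg))
    return node

def tree_size(node):
    """Number of nodes in a zss tree."""
    return 1 + sum(tree_size(child) for child in node.children)

def normalized_edit_distance(pred_expr, true_expr):
    """Tree edit distance divided by the ground-truth tree size, capped at 1."""
    t_pred, t_true = to_tree(pred_expr), to_tree(true_expr)
    d = simple_distance(t_pred, t_true)      # unit-cost Zhang-Shasha distance
    return min(1.0, d / tree_size(t_true))

pred = sympy.sympify("sin(x) + x**2")
true = sympy.sympify("sin(x) + x**3")
print(normalized_edit_distance(pred, true))  # small: trees differ in one leaf
```

Normalizing by the reference tree's size makes scores comparable across formulas of different complexity: 0 indicates an exact structural match, while 1 indicates the prediction is structurally no closer than an empty tree.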
