Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert (2308.09474v3)
Abstract: The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor in settings with large amounts of experimental data. Unfortunately, data-driven methods often fail to discover valid laws when data is noisy or scarce. Accordingly, recent works combine regression and reasoning to eliminate formulae inconsistent with background theory. However, the problem of searching over the space of formulae consistent with background theory to find one that best fits the data is not well solved. We propose a solution to this problem when all axioms and scientific laws are expressible via polynomial equalities and inequalities, and argue that our approach is widely applicable. We model notions of minimal complexity using binary variables and logical constraints, solve polynomial optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries in a principled manner using Positivstellensatz certificates. The optimization techniques leveraged in this paper allow our approach to run in polynomial time with fully correct background theory (under the assumption that the complexity of our derivation is bounded), or in non-deterministic polynomial (NP) time with partially correct background theory. We demonstrate that some famous scientific laws, including Kepler's Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated Gravitational Wave Power Equation, can be derived in a principled manner from axioms and experimental data.
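As background on the certificates mentioned above, the following is the textbook statement of Putinar's Positivstellensatz (included here as an illustrative sketch, not text from the paper): if $S = \{x \in \mathbb{R}^n : g_1(x) \ge 0, \dots, g_m(x) \ge 0\}$ satisfies the Archimedean condition and a polynomial $p$ is strictly positive on $S$, then $p$ admits a representation

$$p(x) = \sigma_0(x) + \sum_{i=1}^{m} \sigma_i(x)\, g_i(x),$$

where each $\sigma_i$ is a sum-of-squares polynomial. Searching for such multipliers $\sigma_i$ up to a fixed degree bound is a semidefinite program, which is what links certificate search to the mixed-integer linear and semidefinite optimization machinery described in the abstract.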