Symbolic Regression on Sparse and Noisy Data with Gaussian Processes
Abstract: In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. Symbolic regression algorithms depend on high-quality data, so limited or noisy measurements make it difficult to recover accurate models. To overcome this, we combine Gaussian process regression with the sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our approach, GPSINDy, is more robust to sparse, noisy data than SINDy alone. We demonstrate its effectiveness on simulation data from Lotka-Volterra and unicycle models and on hardware data from an NVIDIA JetRacer system. GPSINDy outperforms SINDy and other baselines by more than 50% in predicting future trajectories from noise-corrupted, sparse 5 Hz data.
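The idea in the abstract, smoothing the measurements with a Gaussian process and then running a sparse regression over candidate terms, can be illustrated with a small NumPy sketch. This is not the authors' GPSINDy implementation. It assumes a squared-exponential kernel with hand-picked hyperparameters, finite-difference derivatives of the smoothed states, and sequentially thresholded least squares as the sparse solver. The Lotka-Volterra coefficients, initial condition, noise level, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_smooth(t, y, length_scale=1.0, signal_var=1.0, noise_var=0.01):
    """GP posterior mean at the training times t (hyperparameters fixed by hand)."""
    K = rbf_kernel(t, t, length_scale, signal_var)
    mean = y.mean()
    alpha = np.linalg.solve(K + noise_var * np.eye(len(t)), y - mean)
    return mean + K @ alpha

def poly_library(X):
    """Candidate terms [1, x1, x2, x1^2, x1*x2, x2^2] for a two-state system."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: sparse xi with theta @ xi ~ dxdt."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                # Refit only the surviving terms for state k.
                xi[keep, k] = np.linalg.lstsq(theta[:, keep], dxdt[:, k], rcond=None)[0]
    return xi

def lotka_volterra(x, a=1.1, b=0.4, c=1.0, d=0.4):
    """Predator-prey dynamics; coefficient values are illustrative assumptions."""
    return np.array([a * x[0] - b * x[0] * x[1], -c * x[1] + d * x[0] * x[1]])

def simulate(f, x0, t):
    """Fixed-step RK4 integration of dx/dt = f(x) on the time grid t."""
    X = np.empty((len(t), len(x0)))
    X[0] = x0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = f(X[i])
        k2 = f(X[i] + 0.5 * h * k1)
        k3 = f(X[i] + 0.5 * h * k2)
        k4 = f(X[i] + h * k3)
        X[i + 1] = X[i] + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return X

rng = np.random.default_rng(0)
t = np.arange(0.0, 30.0, 0.2)                       # 5 Hz sampling
X_true = simulate(lotka_volterra, np.array([1.0, 0.5]), t)
X_noisy = X_true + 0.1 * rng.standard_normal(X_true.shape)

# Step 1: denoise each state with a GP. Step 2: differentiate the smoothed states.
X_smooth = np.column_stack([gp_smooth(t, X_noisy[:, j]) for j in range(2)])
dX = np.gradient(X_smooth, t, axis=0)

# Step 3: sparse regression over the candidate library (rows: terms, columns: states).
xi = stlsq(poly_library(X_smooth), dX)
print(np.round(xi, 2))
```

Each column of `xi` holds the coefficients of one state equation over the six candidate terms. Whether the correct terms survive depends on the noise level, the threshold, and the kernel hyperparameters, which in practice would usually be fit to the data (for example by maximizing the marginal likelihood) rather than fixed as they are here.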