
SymbolNet: Neural Symbolic Regression with Adaptive Dynamic Pruning for Compression (2401.09949v3)

Published 18 Jan 2024 in cs.LG, hep-ex, and physics.ins-det

Abstract: Compact symbolic expressions have been shown to be more efficient than neural network models in terms of resource consumption and inference speed when implemented on custom hardware such as FPGAs, while maintaining comparable accuracy [32]. These capabilities are highly valuable in environments with stringent computational resource constraints, such as high-energy physics experiments at the CERN Large Hadron Collider. However, finding compact expressions for high-dimensional datasets remains challenging due to the inherent limitations of genetic programming, the search algorithm underlying most symbolic regression methods. In contrast to genetic programming, the neural network approach to symbolic regression offers scalability to high-dimensional inputs and leverages gradient methods for faster equation search. Common ways of constraining expression complexity often involve multistage pruning with fine-tuning, which can result in significant performance loss. In this work, we propose $\tt{SymbolNet}$, a neural network approach to symbolic regression specifically designed as a model compression technique, aimed at enabling low-latency inference for high-dimensional inputs on custom hardware such as FPGAs. This framework allows dynamic pruning of model weights, input features, and mathematical operators in a single training process, where both training loss and expression complexity are optimized simultaneously. We introduce a sparsity regularization term for each pruning type, which can adaptively adjust its strength, leading to convergence at a target sparsity ratio. Unlike most existing symbolic regression methods that struggle with datasets containing more than $\mathcal{O}(10)$ inputs, we demonstrate the effectiveness of our model on the LHC jet tagging task (16 inputs), MNIST (784 inputs), and SVHN (3072 inputs).
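The adaptive pruning mechanism described in the abstract can be illustrated with a small, self-contained sketch. The following NumPy toy is not the authors' implementation: the mask-score parameterization, the fixed pruning threshold, the "task" pull toward preferred score values, and all gains are illustrative assumptions. It only shows how a regularization strength that integrates the gap between the current and target sparsity can drive a set of trainable pruning scores to settle near the requested sparsity ratio.

```python
import numpy as np

# Toy illustration (not the paper's actual code): one trainable "mask score"
# per prunable element; a score below the threshold means the element
# (weight, input feature, or operator) is treated as pruned.
rng = np.random.default_rng(0)
init = rng.uniform(0.0, 1.0, size=512)   # stand-in for task-preferred score values
scores = init.copy()

threshold = 0.5          # pruning threshold on the mask scores (assumption)
target_sparsity = 0.80   # desired fraction of pruned elements
strength = 0.0           # adaptive regularization strength, starts at zero
strength_step = 0.01     # how fast the strength reacts to the sparsity gap
lr = 0.1                 # learning rate for the mask scores

for step in range(1500):
    sparsity = float(np.mean(scores < threshold))
    # Adaptive rule: raise the penalty while sparsity is below target,
    # relax it once the target is exceeded (kept non-negative).
    strength = max(0.0, strength + strength_step * (target_sparsity - sparsity))
    # Gradient step: a toy "task" term pulls each score back toward its
    # preferred value, while the sparsity penalty pushes all scores down.
    scores += lr * ((init - scores) - strength)
    if step % 300 == 0:
        print(f"step {step:4d}  sparsity={sparsity:.3f}  strength={strength:.3f}")

print(f"final sparsity={np.mean(scores < threshold):.3f} (target {target_sparsity})")
```

In SymbolNet itself this feedback is applied per pruning type (model weights, input features, and operators) within a single training run, with the training loss and expression complexity optimized together; the sketch only mimics the interplay between achieved sparsity and penalty strength.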

References (46)
  1. M. Planck, “On an improvement of Wien’s equation for the spectrum,” Verh. Dtsch. Phys. Ges., vol. 2, 1900.
  2. M. Virgolin and S. P. Pissis, “Symbolic regression is NP-hard,” arXiv:2207.01018, 2022.
  3. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. [Online]. Available: https://www.tensorflow.org/
  4. J. Liu, Z. Xu, R. Shi, R. C. C. Cheung, and H. K. H. So, “Dynamic sparse training: Find efficient sparse network from scratch with trainable masked layers,” arXiv:2005.06870, 2020.
  5. J. Koza, “Genetic programming as a means for programming computers by natural selection,” Statistics and Computing, vol. 4, pp. 87–112, 1994.
  6. M. Schmidt and H. Lipson, “Distilling free-form natural laws from experimental data,” Science, vol. 324, pp. 81–85, 2009. [Online]. Available: https://doi.org/10.1126/science.1165893
  7. M. Cranmer, “Interpretable machine learning for science with PySR and SymbolicRegression.jl,” arXiv:2305.01582, 2023.
  8. D. Wadekar, F. Villaescusa-Navarro, S. Ho, and L. Perreault-Levasseur, “Modeling assembly bias with machine learning and symbolic regression,” arXiv:2012.00111, 2020.
  9. H. Shao, F. Villaescusa-Navarro, S. Genel, D. N. Spergel, D. Anglés-Alcázar, L. Hernquist, R. Davé, D. Narayanan, G. Contardo, and M. Vogelsberger, “Finding universal relations in subhalo properties with artificial intelligence,” The Astrophysical Journal, vol. 927, no. 1, p. 85, 2022. [Online]. Available: https://doi.org/10.3847/1538-4357/ac4d30
  10. A. M. Delgado, D. Wadekar, B. Hadzhiyska, S. Bose, L. Hernquist, and S. Ho, “Modelling the galaxy–halo connection with machine learning,” Monthly Notices of the Royal Astronomical Society, vol. 515, no. 2, pp. 2733–2746, 2022. [Online]. Available: https://doi.org/10.1093/mnras/stac1951
  11. D. Wadekar, L. Thiele, J. C. Hill, S. Pandey, F. Villaescusa-Navarro, D. N. Spergel, M. Cranmer, D. Nagai, D. Anglés-Alcázar, S. Ho, and L. Hernquist, “The SZ flux-mass ($Y$–$M$) relation at low-halo masses: improvements with symbolic regression and strong constraints on baryonic feedback,” Monthly Notices of the Royal Astronomical Society, vol. 522, no. 2, pp. 2628–2643, 2023. [Online]. Available: https://doi.org/10.1093/mnras/stad1128
  12. P. Lemos, N. Jeffrey, M. Cranmer, S. Ho, and P. Battaglia, “Rediscovering orbital mechanics with machine learning,” arXiv:2202.02306, 2022.
  13. D. Wadekar, L. Thiele, F. Villaescusa-Navarro, J. C. Hill, M. Cranmer, D. N. Spergel, N. Battaglia, D. Anglés-Alcázar, L. Hernquist, and S. Ho, “Augmenting astrophysical scaling relations with machine learning: Application to reducing the Sunyaev–Zeldovich flux–mass scatter,” Proceedings of the National Academy of Sciences, vol. 120, no. 12, 2023. [Online]. Available: https://doi.org/10.1073/pnas.2202074120
  14. A. Grundner, T. Beucler, P. Gentine, and V. Eyring, “Data-driven equation discovery of a cloud cover parameterization,” arXiv:2304.08063, 2023.
  15. T. Stephens, “Genetic programming in Python, with a scikit-learn inspired API: gplearn,” 2016. [Online]. Available: https://gplearn.readthedocs.io/en/stable/
  16. B. Burlacu, G. Kronberger, and M. Kommenda, “Operon C++: An efficient genetic programming framework for symbolic regression,” in Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, ser. GECCO ’20.   New York, NY, USA: Association for Computing Machinery, 2020, pp. 1562–1570. [Online]. Available: https://doi.org/10.1145/3377929.3398099
  17. M. Virgolin, T. Alderliesten, C. Witteveen, and P. A. N. Bosman, “Improving model-based genetic programming for symbolic regression of small expressions,” Evolutionary Computation, vol. 29, no. 2, pp. 211–237, 2021. [Online]. Available: https://doi.org/10.1162/evco_a_00278
  18. G. Martius and C. H. Lampert, “Extrapolation and learning equations,” 2016.
  19. S. S. Sahoo, C. H. Lampert, and G. Martius, “Learning equations for extrapolation and control,” 2018.
  20. M. Werner, A. Junginger, P. Hennig, and G. Martius, “Informed equation learning,” 2021.
  21. S. Kim, P. Y. Lu, S. Mukherjee, M. Gilbert, L. Jing, V. Ceperic, and M. Soljacic, “Integration of neural network-based symbolic regression in deep learning for scientific discovery,” IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 9, pp. 4166–4177, 2021. [Online]. Available: https://doi.org/10.1109/tnnls.2020.3017010
  22. I. A. Abdellaoui and S. Mehrkanoon, “Symbolic regression for scientific discovery: an application to wind speed forecasting,” arXiv:2102.10570, 2021.
  23. A. Costa, R. Dangovski, O. Dugan, S. Kim, P. Goyal, M. Soljačić, and J. Jacobson, “Fast neural models for symbolic regression at scale,” arXiv:2007.10784, 2021.
  24. B. K. Petersen, M. L. Larma, T. N. Mundhenk, C. P. Santiago, S. K. Kim, and J. T. Kim, “Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients,” in International Conference on Learning Representations, 2021. [Online]. Available: https://openreview.net/forum?id=m5Qsh0kBQG
  25. H. Zhou and W. Pan, “Bayesian learning to discover mathematical operations in governing equations of dynamic systems,” arXiv:2206.00669, 2022.
  26. J. Kubalík, E. Derner, and R. Babuška, “Toward physically plausible data-driven models: A novel neural network approach to symbolic regression,” IEEE Access, vol. 11, pp. 61481–61501, 2023. [Online]. Available: https://doi.org/10.1109/access.2023.3287397
  27. L. Biggio, T. Bendinelli, A. Neitz, A. Lucchi, and G. Parascandolo, “Neural symbolic regression that scales,” in Proceedings of the 38th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, M. Meila and T. Zhang, Eds., vol. 139.   PMLR, 18–24 Jul 2021, pp. 936–945. [Online]. Available: https://proceedings.mlr.press/v139/biggio21a.html
  28. M. Valipour, B. You, M. Panju, and A. Ghodsi, “SymbolicGPT: A generative transformer model for symbolic regression,” 2021.
  29. P.-A. Kamienny, S. d’Ascoli, G. Lample, and F. Charton, “End-to-end symbolic regression with transformers,” in Advances in Neural Information Processing Systems, 2022.
  30. M. Vastl, J. Kulhánek, J. Kubalík, E. Derner, and R. Babuška, “Symformer: End-to-end symbolic regression using transformer-based architecture,” arXiv:2205.15764, 2022.
  31. A. Meurer, C. P. Smith, M. Paprocki, O. Čertík, S. B. Kirpichev, M. Rocklin, A. Kumar, S. Ivanov, J. K. Moore, S. Singh, T. Rathnayake, S. Vig, B. E. Granger, R. P. Muller, F. Bonazzi, H. Gupta, S. Vats, F. Johansson, F. Pedregosa, M. J. Curry, A. R. Terrel, Š. Roučka, A. Saboo, I. Fernando, S. Kulal, R. Cimrman, and A. Scopatz, “SymPy: symbolic computing in Python,” PeerJ Computer Science, vol. 3, p. e103, Jan. 2017. [Online]. Available: https://doi.org/10.7717/peerj-cs.103
  32. H. F. Tsoi, A. A. Pol, V. Loncar, E. Govorkova, M. Cranmer, S. Dasu, P. Elmer, P. Harris, I. Ojalvo, and M. Pierini, “Symbolic regression on FPGAs for fast machine learning inference,” arXiv:2305.04099, 2023.
  33. M. Pierini, J. M. Duarte, N. Tran, and M. Freytsis, “HLS4ML LHC Jet dataset (150 particles),” 2020. [Online]. Available: https://doi.org/10.5281/zenodo.3602260
  34. Y. LeCun and C. Cortes, “MNIST handwritten digit database,” 2010. [Online]. Available: http://yann.lecun.com/exdb/mnist/
  35. Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, “Reading digits in natural images with unsupervised feature learning,” in NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, 2011. [Online]. Available: http://ufldl.stanford.edu/housenumbers/nips2011_housenumbers.pdf
  36. ATLAS Collaboration, “Technical Design Report for the Phase-II Upgrade of the ATLAS TDAQ System,” CERN-LHCC-2017-020, ATLAS-TDR-029, 2017.
  37. CMS Collaboration, “The Phase-2 Upgrade of the CMS Level-1 Trigger,” CERN-LHCC-2020-004, CMS-TDR-021, 2020.
  38. J. Duarte et al., “Fast inference of deep neural networks in FPGAs for particle physics,” JINST, vol. 13, no. 07, p. P07027, 2018.
  39. E. A. Moreno, O. Cerri, J. M. Duarte, H. B. Newman, T. Q. Nguyen, A. Periwal, M. Pierini, A. Serikova, M. Spiropulu, and J.-R. Vlimant, “JEDI-net: a jet identification algorithm based on interaction networks,” Eur. Phys. J. C, vol. 80, no. 1, p. 58, 2020.
  40. E. Coleman, M. Freytsis, A. Hinzmann, M. Narain, J. Thaler, N. Tran, and C. Vernieri, “The importance of calorimetry for highly-boosted jet substructure,” JINST, vol. 13, no. 01, p. T01003, 2018.
  41. T. Aarrestad et al., “Fast convolutional neural networks on FPGAs with hls4ml,” Mach. Learn. Sci. Tech., vol. 2, no. 4, p. 045015, 2021.
  42. C. N. Coelho, A. Kuusela, S. Li, H. Zhuang, T. Aarrestad, V. Loncar, J. Ngadiuba, M. Pierini, A. A. Pol, and S. Summers, “Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors,” Nature Mach. Intell., vol. 3, pp. 675–686, 2021.
  43. M. Zhu and S. Gupta, “To prune, or not to prune: exploring the efficacy of pruning for model compression,” 2017.
  44. FastML Team, “fastmachinelearning/hls4ml,” 2021. [Online]. Available: https://github.com/fastmachinelearning/hls4ml
  45. Xilinx, “Vivado Design Suite User Guide: High-Level Synthesis,” https://www.xilinx.com/support/documentation/sw_manuals/xilinx2020_1/ug902-vivado-high-level-synthesis.pdf, 2020.
  46. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2017.
