Re-Simulation-based Self-Supervised Learning for Pre-Training Foundation Models (2403.07066v2)
Abstract: Self-Supervised Learning (SSL) is at the core of training modern large machine learning models, providing a scheme for learning powerful representations that can be used in a variety of downstream tasks. However, SSL strategies must be adapted to the type of training data and downstream tasks required. We propose RS3L ("Re-simulation-based self-supervised representation learning"), a novel simulation-based SSL strategy that employs a method of re-simulation to drive data augmentation for contrastive learning in the physical sciences, particularly in fields that rely on stochastic simulators. By intervening in the middle of the simulation process and re-running simulation components downstream of the intervention, we generate multiple realizations of an event, thus producing a set of augmentations covering all physics-driven variations available in the simulator. Using experiments from high-energy physics, we explore how this strategy may enable the development of a foundation model; we show how RS3L pre-training enables powerful performance in downstream tasks such as discrimination of a variety of objects and uncertainty mitigation. In addition to our results, we make the RS3L dataset publicly available for further studies on how to improve SSL strategies.
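To make the augmentation scheme concrete, below is a minimal sketch of re-simulation-driven contrastive pre-training. It assumes a toy interface: `resimulate` stands in for the stochastic simulation components downstream of the intervention point (mocked here with seeded Gaussian smearing rather than a real shower or detector simulation), and the encoder and NT-Xent loss are generic SimCLR-style choices. All names are illustrative and not the paper's actual code or architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def resimulate(hard_event: torch.Tensor, seed: int) -> torch.Tensor:
    """Re-run the stochastic part of the pipeline from the same intermediate
    state ("hard event"), producing one realization of the final observables.
    Stochasticity is mocked with seeded Gaussian smearing (illustrative only)."""
    g = torch.Generator().manual_seed(seed)
    noise = torch.randn(hard_event.shape, generator=g)
    return hard_event + 0.1 * noise  # stand-in for shower/detector variations

class Encoder(nn.Module):
    """Toy encoder mapping an event to a normalized representation vector."""
    def __init__(self, in_dim: int = 64, out_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Standard NT-Xent (SimCLR-style) contrastive loss between two views."""
    z = torch.cat([z1, z2], dim=0)          # (2N, D)
    sim = z @ z.T / tau                     # cosine similarities (z is normalized)
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    # For sample i, its positive is the other re-simulated view of the same event.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Usage: each intermediate state is re-simulated twice with different seeds to
# form a positive pair; the encoder learns to make the pair's embeddings agree.
hard_events = torch.randn(256, 64)          # batch of intermediate states
view_a = resimulate(hard_events, seed=0)
view_b = resimulate(hard_events, seed=1)
encoder = Encoder()
loss = nt_xent(encoder(view_a), encoder(view_b))
loss.backward()
```

In this sketch the positive pairs differ only by the stochastic components re-run downstream of the intervention, which is the role the re-simulation step plays in the augmentation strategy described above.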