FAIR AI Models in High Energy Physics (2212.05081v3)

Published 9 Dec 2022 in hep-ex, cs.LG, and physics.comp-ph

Abstract: The findable, accessible, interoperable, and reusable (FAIR) data principles provide a framework for examining, evaluating, and improving how data is shared to facilitate scientific discovery. Generalizing these principles to research software and other digital products is an active area of research. Machine learning (ML) models -- algorithms that have been trained on data without being explicitly programmed -- and more generally, AI models, are an important target for this because of the ever-increasing pace with which AI is transforming scientific domains, such as experimental high energy physics (HEP). In this paper, we propose a practical definition of FAIR principles for AI models in HEP and describe a template for the application of these principles. We demonstrate the template's use with an example AI model applied to HEP, in which a graph neural network is used to identify Higgs bosons decaying to two bottom quarks. We report on the robustness of this FAIR AI model, its portability across hardware architectures and software frameworks, and its interpretability.
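
One concrete way to exercise the portability claim in the abstract, that a FAIR AI model should run across hardware architectures and software frameworks, is an ONNX export round-trip test. The sketch below is illustrative only: `TinyTagger`, its layer sizes, and the 48-feature input are hypothetical stand-ins, not the paper's interaction-network H→bb tagger. It exports a small PyTorch model to ONNX and checks that ONNX Runtime reproduces the PyTorch outputs.

```python
# Minimal portability sketch (hypothetical model, not the paper's tagger):
# export a small PyTorch network to ONNX, then verify that ONNX Runtime
# reproduces the PyTorch outputs on the same input.
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

class TinyTagger(nn.Module):
    """Hypothetical placeholder for a jet-tagging network."""
    def __init__(self, n_features=48):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

model = TinyTagger().eval()
dummy = torch.randn(1, 48)  # one jet's flattened input features

# Export the trained graph so it can run outside PyTorch.
torch.onnx.export(model, dummy, "tagger.onnx",
                  input_names=["features"], output_names=["score"])

# Re-run the exported model with ONNX Runtime and compare outputs.
session = ort.InferenceSession("tagger.onnx")
onnx_score = session.run(None, {"features": dummy.numpy()})[0]
torch_score = model(dummy).detach().numpy()
assert np.allclose(onnx_score, torch_score, atol=1e-5), "outputs diverge"
print("PyTorch and ONNX outputs agree:", torch_score.ravel())
```

A check like this, run on each target backend (CPU, GPU, or an inference server), gives a reproducible pass/fail signal for the kind of cross-framework reuse the FAIR principles call for.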
