Ultrafast jet classification on FPGAs for the HL-LHC (2402.01876v2)
Published 2 Feb 2024 in hep-ex, cs.LG, and physics.ins-det
Abstract: Three machine learning models are used to perform jet origin classification. These models are optimized for deployment on a field-programmable gate array device. In this context, we demonstrate how latency and resource consumption scale with the input size and choice of algorithm. Moreover, the models proposed here are designed to work on the type of data and under the foreseen conditions at the CERN LHC during its high-luminosity phase. Through quantization-aware training and efficient synthetization for a specific field-programmable gate array, we show that $O(100)$ ns inference of complex architectures such as Deep Sets and Interaction Networks is feasible at a relatively low computational resource cost.
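The abstract compresses the whole pipeline into one sentence, so a concrete sketch may help. Below is a minimal, hypothetical illustration of the workflow it describes: a small quantization-aware, Deep Sets-style jet tagger built with QKeras (a shared per-constituent network, permutation-invariant pooling, and a jet-level classifier head), then converted to FPGA firmware with hls4ml. The layer widths, bit precisions, constituent count, and FPGA part are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch, assuming a Deep Sets-style architecture; all sizes,
# bit widths, and the FPGA part number below are illustrative, not the
# paper's settings.
import tensorflow as tf
from tensorflow.keras.layers import Input, GlobalAveragePooling1D, Activation
from qkeras import QDense, QActivation, quantized_bits, quantized_relu
import hls4ml

N_CONST, N_FEAT, N_CLASSES = 16, 3, 5  # constituents per jet, features, jet classes
quant = quantized_bits(8, 0, alpha=1)  # 8-bit fixed-point weight/bias quantizer

# Per-constituent network phi: a Dense layer on a (N_CONST, N_FEAT) input
# acts on the last axis, i.e. it is applied identically to every particle.
inputs = Input(shape=(N_CONST, N_FEAT))
x = QDense(32, kernel_quantizer=quant, bias_quantizer=quant)(inputs)
x = QActivation(quantized_relu(8))(x)

# Permutation-invariant aggregation over the constituent axis.
x = GlobalAveragePooling1D()(x)

# Jet-level network rho, ending in the class scores.
x = QDense(32, kernel_quantizer=quant, bias_quantizer=quant)(x)
x = QActivation(quantized_relu(8))(x)
x = QDense(N_CLASSES, kernel_quantizer=quant, bias_quantizer=quant)(x)
outputs = Activation("softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit(...)  # quantization-aware training on the labeled jet dataset

# Convert the trained model to an HLS project for a specific FPGA part.
config = hls4ml.utils.config_from_keras_model(model, granularity="name")
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir="jet_tagger_hls",
    part="xcvu13p-flga2577-2-e",  # example Xilinx UltraScale+ part
)
hls_model.compile()  # C simulation; hls_model.build() runs full HLS synthesis
```

The point of training with the quantizers in the loop, rather than quantizing after training, is that the network learns weights that already tolerate the reduced fixed-point precision, so the synthesized firmware can reach the low bit widths needed for $O(100)$ ns latency with little accuracy loss relative to the floating-point model.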