
Differentiable Earth Mover's Distance for Data Compression at the High-Luminosity LHC (2306.04712v3)

Published 7 Jun 2023 in hep-ex, cs.LG, and physics.ins-det

Abstract: The Earth mover's distance (EMD) is a useful metric for image recognition and classification, but its usual implementations are either not differentiable or too slow to be used as a loss function for training other algorithms via gradient descent. In this paper, we train a convolutional neural network (CNN) to learn a differentiable, fast approximation of the EMD and demonstrate that it can be used as a substitute for computationally intensive EMD implementations. We apply this differentiable approximation in the training of an autoencoder-inspired neural network (encoder NN) for data compression at the High-Luminosity LHC at CERN. The goal of this encoder NN is to compress the data while preserving the information related to the distribution of energy deposits in particle detectors. We demonstrate that the performance of our encoder NN trained using the differentiable EMD CNN surpasses that of training with loss functions based on mean squared error.

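For a concrete picture of the training setup described in the abstract, the sketch below illustrates the two-stage idea: a small CNN is first fit to reproduce exact EMD values for pairs of energy-deposit images, and is then frozen and used as a differentiable loss when training the encoder NN. This is a minimal sketch assuming a TensorFlow/Keras implementation; the 8x8 single-channel input shape, layer sizes, and names (build_emd_cnn, emd_loss) are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch, assuming TensorFlow/Keras; shapes, layer sizes, and names
# are illustrative assumptions rather than the authors' configuration.
import tensorflow as tf

def build_emd_cnn(shape=(8, 8, 1)):
    """CNN surrogate mapping a pair of energy-deposit images to a scalar EMD estimate."""
    img_a = tf.keras.Input(shape=shape)
    img_b = tf.keras.Input(shape=shape)
    x = tf.keras.layers.Concatenate(axis=-1)([img_a, img_b])  # stack the two images as channels
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    emd = tf.keras.layers.Dense(1, activation="relu")(x)      # EMD is non-negative
    return tf.keras.Model(inputs=[img_a, img_b], outputs=emd)

emd_cnn = build_emd_cnn()

# Stage 1: regress exact EMD values computed offline (e.g. with an
# optimal-transport library) for pairs of simulated energy-deposit images.
emd_cnn.compile(optimizer="adam", loss="huber")
# emd_cnn.fit([images_a, images_b], exact_emd_targets, epochs=...)

# Stage 2: freeze the surrogate and use it as a differentiable loss for the
# autoencoder-inspired encoder NN (the encoder/decoder model itself is omitted).
emd_cnn.trainable = False

def emd_loss(y_true, y_pred):
    # y_true: original energy image; y_pred: image reconstructed from the
    # compressed representation. Gradients flow through the frozen CNN.
    return tf.reduce_mean(emd_cnn([y_true, y_pred]))

# autoencoder.compile(optimizer="adam", loss=emd_loss)
# autoencoder.fit(images, images, epochs=...)
```

The key design point, as stated in the abstract, is that once the surrogate is trained it stays fixed, so the encoder NN can be optimized by gradient descent directly against an EMD-like objective instead of a mean-squared-error loss.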
