Quark-versus-gluon tagging in CMS Open Data with CWoLa and TopicFlow (2312.03434v1)

Published 6 Dec 2023 in hep-ph

Abstract: We use the CMS Open Data to examine the performance of weakly-supervised learning for tagging quark and gluon jets at the LHC. We target $Z$+jet and dijet events as respective quark- and gluon-enriched mixtures and derive samples both from data taken in 2011 at 7 TeV, and from Monte Carlo. CWoLa and TopicFlow models are trained on real data and compared to fully-supervised classifiers trained on simulation. In order to obtain estimates for the discrimination power in real data, we consider three different estimates of the quark/gluon mixture fractions in the data. Compared to when the models are evaluated on simulation, we find reversed rankings for the fully- and weakly-supervised approaches. Further, these rankings based on data are robust to the estimate of the mixture fraction in the test set. Finally, we use TopicFlow to smooth statistical fluctuations in the small testing set, and to provide uncertainty on the performance in real data.
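As a rough illustration of the weakly-supervised idea behind CWoLa, the Python sketch below trains a classifier to separate two mixed samples using only the mixture labels, then checks how well the resulting score separates truth-labeled quark and gluon jets. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional Gaussian "jet observable", the quark fractions `F_A = 0.8` and `F_B = 0.3`, and the use of plain logistic regression. The final step shows how an assumed pair of mixture fractions can be used to convert efficiencies measured on the mixtures into estimated pure-quark and pure-gluon efficiencies, which is the kind of calculation needed when truth labels are unavailable, as in real data.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_jets(n, quark_fraction):
    """Toy 1D observable (assumed shapes): quarks ~ N(0, 1), gluons ~ N(1.5, 1)."""
    is_quark = rng.random(n) < quark_fraction
    x = np.where(is_quark, rng.normal(0.0, 1.0, n), rng.normal(1.5, 1.0, n))
    return x, is_quark

# Hypothetical quark fractions for the quark-enriched (A) and gluon-enriched (B) samples.
F_A, F_B = 0.8, 0.3
x_a, _ = sample_jets(20000, F_A)
x_b, _ = sample_jets(20000, F_B)

# CWoLa-style training: the only labels are "came from A" (1) or "came from B" (0).
x_train = np.concatenate([x_a, x_b])
y_train = np.concatenate([np.ones_like(x_a), np.zeros_like(x_b)])

w, b = 0.0, 0.0
for _ in range(2000):  # gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(w * x_train + b)))
    grad = p - y_train
    w -= 0.1 * np.mean(grad * x_train)
    b -= 0.1 * np.mean(grad)

def score(x):
    """Classifier output; larger values mean more A-like, i.e. more quark-like."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

def auc(pos, neg):
    """Rank-based AUC: probability that a positive outscores a negative."""
    ranks = np.concatenate([pos, neg]).argsort().argsort() + 1
    u = ranks[: len(pos)].sum() - len(pos) * (len(pos) + 1) / 2
    return u / (len(pos) * len(neg))

# Evaluate on fresh mixtures where the toy truth labels are kept for cross-checks.
xt_a, q_a = sample_jets(20000, F_A)
xt_b, q_b = sample_jets(20000, F_B)
x_test = np.concatenate([xt_a, xt_b])
q_test = np.concatenate([q_a, q_b])
auc_truth = auc(score(x_test[q_test]), score(x_test[~q_test]))

# Unmixing: eff_A = F_A * eff_q + (1 - F_A) * eff_g, and likewise for B.
threshold = np.median(score(x_test))
eff_a = np.mean(score(xt_a) > threshold)
eff_b = np.mean(score(xt_b) > threshold)
mixing = np.array([[F_A, 1.0 - F_A], [F_B, 1.0 - F_B]])
eff_q_unmixed, eff_g_unmixed = np.linalg.solve(mixing, np.array([eff_a, eff_b]))

# Truth-based efficiencies, available only because this is a toy.
eff_q_true = np.mean(score(x_test[q_test]) > threshold)
eff_g_true = np.mean(score(x_test[~q_test]) > threshold)

print(f"AUC on truth labels: {auc_truth:.3f}")
print(f"quark efficiency: unmixed {eff_q_unmixed:.3f}, truth {eff_q_true:.3f}")
print(f"gluon efficiency: unmixed {eff_g_unmixed:.3f}, truth {eff_g_true:.3f}")
```

In this toy the unmixed efficiencies can be compared directly with the truth-based ones. With real data no such cross-check exists, so the result depends on how well the mixture fractions are known, and the inversion becomes unstable as `F_A` and `F_B` approach each other.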
