
Particle Transformer for Jet Tagging (2202.03772v3)

Published 8 Feb 2022 in hep-ph, cs.LG, hep-ex, and physics.data-an

Abstract: Jet tagging is a critical yet challenging classification task in particle physics. While deep learning has transformed jet tagging and significantly improved performance, the lack of a large-scale public dataset impedes further enhancement. In this work, we present JetClass, a new comprehensive dataset for jet tagging. The JetClass dataset consists of 100 M jets, about two orders of magnitude larger than existing public datasets. A total of 10 types of jets are simulated, including several types unexplored for tagging so far. Based on the large dataset, we propose a new Transformer-based architecture for jet tagging, called Particle Transformer (ParT). By incorporating pairwise particle interactions in the attention mechanism, ParT achieves higher tagging performance than a plain Transformer and surpasses the previous state-of-the-art, ParticleNet, by a large margin. The pre-trained ParT models, once fine-tuned, also substantially enhance the performance on two widely adopted jet tagging benchmarks. The dataset, code and models are publicly available at https://github.com/jet-universe/particle_transformer.


Summary

  • The paper introduces the Particle Transformer (ParT) model, which exploits pairwise particle interactions to greatly enhance jet tagging accuracy.
  • The accompanying JetClass dataset of 100 million jets across 10 classes provides an unprecedented foundation for deep learning research in high-energy physics.
  • Experiments show ParT surpasses ParticleNet by achieving 86.1% accuracy, setting a new benchmark for jet tagging applications.

Overview of "Particle Transformer for Jet Tagging"

The paper advances jet tagging, a critical classification task in high-energy particle physics, by introducing both a novel dataset and an innovative model architecture. The proposed Particle Transformer (ParT) augments a Transformer architecture with pairwise particle interactions, aiming to push tagging performance beyond what earlier deep learning methods achieved.

The Dataset

The authors present JetClass, a remarkably comprehensive dataset of 100 million simulated jets across 10 distinct classes, a scale approximately two orders of magnitude beyond existing public datasets. JetClass includes several jet types not previously explored in the tagging literature, broadening the scope for applications at cutting-edge facilities such as the CERN LHC.

In detail, the dataset contains background jets originating from light quarks or gluons and signal jets arising from heavy particles such as the top quark and the W, Z, and Higgs bosons. Each jet is represented as a cloud of particles, with each particle characterized by features in three categories: kinematics, particle identification, and trajectory displacement. This level of detail matters because it captures the complex radiation patterns that develop within a jet, which earlier models struggled to exploit fully.
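To make this representation concrete, the sketch below builds a toy "particle cloud" carrying the three feature categories described above. It is a minimal illustration only: the field names, types, and random kinematics are assumptions, not the actual JetClass schema.

```python
import numpy as np

# Hypothetical per-particle record with the three feature categories
# discussed above (the actual JetClass schema may differ).
particle_dtype = np.dtype([
    # Kinematics
    ("px", "f4"), ("py", "f4"), ("pz", "f4"), ("energy", "f4"),
    # Particle identification
    ("pid", "i4"), ("charge", "i4"),
    # Trajectory displacement (impact parameters and uncertainties)
    ("d0", "f4"), ("d0_err", "f4"), ("dz", "f4"), ("dz_err", "f4"),
])

def make_jet(n_particles: int, label: int, rng=np.random.default_rng()):
    """Build one toy jet: an unordered 'cloud' of n_particles records."""
    particles = np.zeros(n_particles, dtype=particle_dtype)
    particles["px"], particles["py"], particles["pz"] = rng.normal(
        size=(3, n_particles))
    # Massless approximation for the toy energies.
    particles["energy"] = np.sqrt(
        particles["px"]**2 + particles["py"]**2 + particles["pz"]**2)
    return {"particles": particles, "label": label}

jet = make_jet(n_particles=50, label=0)  # e.g. label 0 = light quark/gluon
print(jet["particles"].shape)            # (50,) -> one record per particle
```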

Methodology: Particle Transformer

The ParT architecture distinguishes itself by injecting pairwise particle interactions into an augmented attention mechanism within a Transformer framework, while dispensing with positional encoding altogether, which makes it well suited to permutation-invariant data such as particle clouds. The network stacks particle attention blocks, which exchange information among particles, followed by class attention blocks that aggregate the per-particle representations into a jet-level summary for classification.
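Concretely, the paper's P-MHA biases the standard scaled dot-product attention logits QK^T/sqrt(d) with an interaction matrix U embedded from pairwise particle features. The single-head PyTorch sketch below illustrates the idea; the tensor shapes and the way U is supplied are simplifications of the released multi-head implementation.

```python
import torch
import torch.nn.functional as F

def p_mha_single_head(q, k, v, u):
    """Minimal single-head sketch of pairwise-biased attention (P-MHA).

    q, k, v : (N, d) query/key/value projections for the N particles of a jet
    u       : (N, N) interaction matrix embedded from pairwise features
    """
    d = q.size(-1)
    # Scaled dot-product logits, biased by the interaction matrix U;
    # no positional encoding is used, so the result respects the
    # unordered nature of the particle cloud.
    logits = q @ k.transpose(-2, -1) / d ** 0.5 + u
    return F.softmax(logits, dim=-1) @ v

# Toy usage: 8 particles, 16-dimensional head.
n, d = 8, 16
q, k, v = (torch.randn(n, d) for _ in range(3))
u = torch.randn(n, n)  # stand-in for the embedded pairwise features
out = p_mha_single_head(q, k, v, u)  # (8, 16)
```

Because U depends only on pairwise quantities, the bias injects physics knowledge into the attention weights without breaking permutation symmetry.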

Experiments show that ParT surpasses the previous state of the art, ParticleNet, by significant margins across several metrics. For instance, ParT reaches 86.1% classification accuracy on JetClass, separating signal jets from background ones more effectively than competing models. This gain translates into improved discovery reach at large particle colliders, enabling more precise identification of events featuring novel physical processes.

Evaluation and Comparison

The newly proposed ParT is evaluated thoroughly against well-established baselines such as PFN, P-CNN, and ParticleNet. Metrics including accuracy, area under the ROC curve (AUC), and background rejection at fixed signal efficiency (TPR) substantiate the considerable performance gain of ParT over these baselines. The authors also perform ablation studies validating the advantage of the P-MHA modules, which harness interactions among particles to enhance model expressiveness and predictive accuracy.
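As a reference for these metrics, background rejection is conventionally defined as 1/FPR measured at a fixed signal efficiency (TPR). A minimal sketch, assuming scikit-learn and a generic binary signal-vs-background score:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def background_rejection(y_true, scores, signal_eff=0.5):
    """Background rejection = 1 / FPR at a fixed signal efficiency (TPR)."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    fpr_at_eff = np.interp(signal_eff, tpr, fpr)  # FPR where TPR = signal_eff
    return 1.0 / fpr_at_eff

# Toy binary task: signal scores shifted above background scores.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=10_000)
scores = rng.normal(loc=y.astype(float))  # higher mean for signal
print(f"AUC = {roc_auc_score(y, scores):.3f}")
print(f"Rej@50% = {background_rejection(y, scores, 0.5):.1f}")
```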

Additionally, the paper assesses the influence of dataset size on model training: models trained on larger subsets of JetClass markedly outperform those trained on smaller ones, underscoring the critical role of large datasets in advancing deep learning for particle physics. The authors further report that ParT models pre-trained on JetClass and then fine-tuned substantially improve performance on two widely adopted jet tagging benchmarks.

Broader Implications and Future Work

The introduction of this extensive dataset, together with the ParT architecture, carries considerable implications for the field. The improved jet tagging performance sets a precedent for future machine learning applications in identifying and classifying elementary particles, potentially accelerating discoveries in high-energy physics.

On the methodological side, ParT highlights the promise of embedding physics-based insights into attention mechanisms, offering a template for future architectures that incorporate domain-specific knowledge to boost performance.

Conclusion

"Particle Transformer for Jet Tagging" offers a substantive contribution through both its expansive JetClass dataset and a novel Transformer-based architecture that produces state-of-the-art results. Together these efforts push the boundaries of machine learning in particle physics and lay the groundwork for further exploration and optimization. The public availability of the dataset, code, and models invites further research, supporting the community in advancing jet tagging and particle identification.
