
The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics (2101.08320v1)

Published 20 Jan 2021 in hep-ph, hep-ex, and physics.data-an

Abstract: A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.

Citations (147)

Summary

  • The paper presents a community challenge that benchmarks machine learning techniques for model-agnostic anomaly detection in simulated LHC events.
  • It evaluates unsupervised, weakly supervised, and semi-supervised methods to identify unconventional signals amid high-dimensional collider data.
  • The study offers practical insights that can refine detection strategies and advance robust searches for Beyond Standard Model phenomena.

An Overview of The LHC Olympics 2020 Paper

The paper "The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics" reports a collaborative effort by high energy physics researchers to develop and benchmark machine learning methods for anomaly detection. The challenge responds to the need for model-agnostic searches for new physics at colliders, specifically the Large Hadron Collider (LHC), given the limitations of the model-dependent search paradigm that dominates current research programs.

Motivation and Context

Searches for Beyond Standard Model (BSM) physics are motivated by open questions such as dark matter, dark energy, neutrino masses, and the matter-antimatter (baryon) asymmetry. Traditional searches are largely model-dependent: they target the signatures of specific models and rely on high-fidelity simulations to distinguish signal from the expected Standard Model events. As a result, they could overlook unconventional anomalies in the complex LHC data.

The rise of machine learning, particularly its applications in anomaly detection, offers a promising way to close this gap. Unsupervised, semi-supervised, and weakly supervised learning techniques can detect signals that deviate from expected behavior without relying heavily on preconceived signal models.

The LHC Olympics Challenge

The LHC Olympics 2020 was designed as a platform to advance anomaly detection methods. It delivered simulated collider events in the form of black boxes: datasets that might or might not contain an unknown anomaly hidden among Standard Model events. Participants were asked to analyze these datasets to identify any anomaly, determine its properties, and estimate how many signal events were present.

To facilitate this, the organizers released the datasets publicly. Participants developed and refined their anomaly detection methods on an R&D dataset and then tested them on the black boxes. They submitted their results, which were reviewed during workshops to gauge performance and derive insights.
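The task of estimating how many signal events a dataset contains can be illustrated with a toy sideband subtraction. This is only a minimal sketch, not the procedure used by the organizers or by any participant: the function name `sideband_signal_estimate`, the mass window, and the sideband width are hypothetical, and the background is assumed to be locally flat (or linear) in the chosen feature.

```python
import numpy as np


def sideband_signal_estimate(mass, window, sideband_width):
    """Estimate the signal yield in a mass window by sideband subtraction.

    Assumption (illustrative only): the background density is locally flat
    or linear across the window and its two neighbouring sidebands, so the
    average sideband density predicts the background inside the window.
    """
    lo, hi = window
    n_window = np.count_nonzero((mass >= lo) & (mass < hi))
    n_left = np.count_nonzero((mass >= lo - sideband_width) & (mass < lo))
    n_right = np.count_nonzero((mass >= hi) & (mass < hi + sideband_width))

    # Background events per unit mass, averaged over both sidebands.
    bkg_density = (n_left + n_right) / (2.0 * sideband_width)
    expected_bkg = bkg_density * (hi - lo)

    # Excess of observed events over the background expectation.
    return n_window - expected_bkg, expected_bkg


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy data: flat background plus a narrow bump (all values hypothetical).
    background = rng.uniform(2000.0, 5000.0, 30000)
    signal = rng.normal(3500.0, 50.0, 500)
    mass = np.concatenate([background, signal])

    n_sig, n_bkg = sideband_signal_estimate(mass, (3350.0, 3650.0), 300.0)
    print(f"estimated signal events: {n_sig:.0f} (expected background {n_bkg:.0f})")
```

The estimate fluctuates statistically around the injected number of signal events. A realistic analysis would fit the background shape rather than assume it is flat.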

Methods Explored

Participants deployed a range of methods divided into three primary categories:

  1. Unsupervised Methods: Techniques that rely solely on data inputs without label information to identify regions within data corresponding to potential anomalies.
  2. Weakly Supervised Methods: Strategies using noisy labels from coarse background-signal distinctions to learn enriched signal characteristics within data.
  3. (Semi)-Supervised Methods: Algorithms employing simulations as part of their training strategy, focusing on capturing specific signal-like behaviors.
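The weakly supervised idea in the list above can be sketched with a toy example. A classifier is trained only to tell a signal-enriched sample from a signal-depleted one, with no per-event truth labels, and its output is then used as an anomaly score. This is a minimal illustration under stated assumptions, not any participant's method: the one-dimensional Gaussian features, the signal fraction, and the histogram-based "classifier" are hypothetical choices made to keep the example short.

```python
import numpy as np


def mixed_sample(rng, n_events, signal_fraction):
    """Toy 1D feature: background ~ N(0, 1), signal ~ N(3, 0.5) (hypothetical)."""
    is_signal = rng.random(n_events) < signal_fraction
    x = np.where(
        is_signal,
        rng.normal(3.0, 0.5, n_events),
        rng.normal(0.0, 1.0, n_events),
    )
    return x, is_signal


def fit_mixed_sample_classifier(x_enriched, x_depleted, bins):
    """Per-bin score p_enriched / (p_enriched + p_depleted).

    Only the sample membership (the noisy label) is used, never truth labels.
    """
    h_e, _ = np.histogram(x_enriched, bins=bins, density=True)
    h_d, _ = np.histogram(x_depleted, bins=bins, density=True)
    total = h_e + h_d
    # Empty bins get the uninformative score 0.5.
    return np.divide(h_e, total, out=np.full_like(h_e, 0.5), where=total > 0)


def score_events(x, bins, bin_scores):
    """Look up the score of the bin each event falls into."""
    idx = np.clip(np.digitize(x, bins) - 1, 0, len(bin_scores) - 1)
    return bin_scores[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bins = np.linspace(-5.0, 6.0, 45)
    x_enriched, truth = mixed_sample(rng, 20000, 0.05)  # 5% signal
    x_depleted, _ = mixed_sample(rng, 20000, 0.0)       # background only

    bin_scores = fit_mixed_sample_classifier(x_enriched, x_depleted, bins)
    scores = score_events(x_enriched, bins, bin_scores)
    # Truth labels are used here only to check the result, not for training.
    print("mean score, signal:    ", scores[truth].mean())
    print("mean score, background:", scores[~truth].mean())
```

Because the two samples differ only in their signal content, events in signal-rich regions receive scores well above 0.5, while background-dominated regions stay near 0.5.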

Results and Conclusions

The challenge brought together a diverse set of anomaly detection strategies and revealed both strengths and areas for improvement across the approaches. Some methods correctly identified anomalous signatures in the datasets. Others provided valuable lessons about the pitfalls that can arise in the statistical treatment of the data, highlighting where existing strategies can be refined or extended, especially for high-dimensional data.

The paper emphasizes the importance of such collaborative challenges, arguing that community-led efforts can lead to robust, model-agnostic detection methods capable of discovering previously overlooked physics phenomena. Going forward, it presents the combination of theoretical and experimental input with machine learning as a route to more sensitive anomaly detection.

Future Prospects

The paper speculates about future developments in the field, including potential applications in the ongoing LHC run and at other colliders. It envisions these anomaly detection strategies being integrated into broader research programs, spanning both theory and analyses of real collider data, so that searches for new physics are as comprehensive as possible.

In summary, the LHC Olympics 2020 paper serves as a critical evaluation of novel anomaly detection methods, acting as a catalyst for the future trajectory of high energy physics and encouraging broader adoption of data-driven methodologies in an increasingly complex scientific domain.
