What is the Machine Learning?

Published 28 Sep 2017 in hep-ph, physics.data-an, and stat.ML | (1709.10106v2)

Abstract: Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. To address this concern, we explore a data planing procedure for identifying combinations of variables -- aided by physical intuition -- that can discriminate signal from background. Weights are introduced to smooth away the features in a given variable(s). New networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. Planing also allows the investigation of the linear versus non-linear nature of the boundaries between signal and background. We demonstrate the efficacy of this approach using a toy example, followed by an application to an idealized heavy resonance scenario at the Large Hadron Collider. By unpacking the information being utilized by these algorithms, this method puts in context what it means for a machine to learn.

Summary

The paper introduces data planing, a method that removes the information carried by chosen input variables so that the resulting loss in classifier performance reveals their discriminative power, in both a toy model and a realistic particle physics scenario.
The technique was validated by comparing neural network performance, showing AUC drops from 0.812 and 0.989 to near chance levels when critical features were planed away.
Findings enhance model transparency and guide targeted feature engineering, fostering deeper insights into ML applications in high-energy physics.

A Study on Data Planing for Machine Learning Transparency in Particle Physics

The paper, "What is the Machine Learning?" by Spencer Chang, Timothy Cohen, and Bryan Ostdiek, investigates a technique called data planing that addresses the black-box nature of machine learning (ML) methods applied to physics problems. The primary goal is to diagnose the discriminative features that ML algorithms rely on, making their output easier to interpret. The paper works through a toy model and a particle physics scenario involving a hypothetical heavy resonance at the Large Hadron Collider (LHC).

Summary of the Techniques and Results

The authors first introduce data planing through a toy model with both linear and nonlinear features. Planing assigns each event a weight chosen so that the weighted distribution of a selected variable is smooth and featureless, which removes that variable's discriminating information. Networks retrained on the planed data perform worse, and the size of the drop measures how much discriminative power the planed variable carried.
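The weighting step can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a simple histogram-based estimate of the density, and the function name `planing_weights`, the bin count, and the Gaussian test sample are all hypothetical choices. In a classification setting one would typically compute such weights separately for signal and background so that both end up featureless in the planed variable.

```python
import numpy as np

def planing_weights(values, bins=50):
    """Per-event weights that flatten the histogram of `values`.

    Each event is weighted by the inverse of the number of events in its
    bin, so every occupied bin carries the same total weight.
    """
    values = np.asarray(values, dtype=float)
    edges = np.histogram_bin_edges(values, bins=bins)
    # Bin index of each event; clip so the maximum value lands in the last bin.
    idx = np.clip(np.digitize(values, edges) - 1, 0, len(edges) - 2)
    counts = np.bincount(idx, minlength=len(edges) - 1)
    weights = 1.0 / counts[idx]
    # Normalise so the weights sum to the number of events.
    return weights * len(values) / weights.sum()

# Hypothetical sample: a peaked variable that we want to plane away.
rng = np.random.default_rng(0)
signal = rng.normal(1.0, 0.3, 10_000)
w = planing_weights(signal, bins=20)

# The weighted histogram is flat across occupied bins.
flat, _ = np.histogram(signal, bins=20, weights=w)
```

The weights would then be passed to the training loss as per-event sample weights, so the network sees a dataset in which the planed variable no longer distinguishes the classes.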

The methodology is validated on a supervised learning task designed to distinguish signal from background events. Trained on the unaltered toy dataset, a deep network reaches an area under the receiver operating characteristic curve (AUC) of 0.812, versus 0.612 for a linear model. Once a non-linear feature such as the radius variable (r) is supplied as an input, the linear model's performance matches that of the deep network. This initial investigation illustrates that planing away significant features isolates the variables responsible for the model's discriminative capacity.
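The effect of adding a radius input can be illustrated with a small numerical sketch. The dataset below is a hypothetical stand-in (a ring of signal around a Gaussian blob of background), not the toy model from the paper, so it does not reproduce the quoted AUC values. The linear model here is an ordinary least-squares fit, chosen only to keep the example dependent on NumPy alone.

```python
import numpy as np

def auc(scores, labels):
    """AUC from the Mann-Whitney rank statistic (assumes no tied scores)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_sig = labels.sum()
    n_bkg = len(labels) - n_sig
    return (ranks[labels == 1].sum() - n_sig * (n_sig + 1) / 2) / (n_sig * n_bkg)

rng = np.random.default_rng(0)
n = 5000

# Hypothetical toy: background is a Gaussian blob, signal is a ring of radius 2.
bkg = rng.normal(0.0, 1.0, size=(n, 2))
phi = rng.uniform(0.0, 2.0 * np.pi, n)
rad = rng.normal(2.0, 0.2, n)
sig = np.column_stack([rad * np.cos(phi), rad * np.sin(phi)])

X = np.vstack([sig, bkg])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Linear model on the raw coordinates (least-squares fit with an intercept).
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
auc_linear = auc(A @ coef, y)

# Same linear model with the radius r added as an engineered input.
r = np.hypot(X[:, 0], X[:, 1])
A_r = np.column_stack([A, r])
coef_r, *_ = np.linalg.lstsq(A_r, y, rcond=None)
auc_with_r = auc(A_r @ coef_r, y)

print(f"linear on (x, y): {auc_linear:.3f}, linear on (x, y, r): {auc_with_r:.3f}")
```

Because both classes are centered on the origin in this toy, no linear combination of the coordinates separates them well, while the radius does. That is the sense in which supplying the right high-level variable lets a linear model recover information a deep network would otherwise have to learn.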

In a more realistic setting, the authors apply their technique to a particle physics problem: separating a hypothetical $Z'$ boson signal from the Standard Model photon background. This is a useful test case because the invariant mass of the decay products is already known to be the prime discriminating feature, so the method's output can be checked against expectation. The results are clear:

When features are left unplaned, the deep network yields an AUC close to 0.989, signifying effective discrimination.
Planing the invariant mass reduces the AUC to around 0.500 for both linear and deep networks, confirming the importance of this variable in discrimination.
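The drop to an AUC near 0.5 can be reproduced qualitatively with a one-variable sketch. Everything here is an assumption made for illustration: the mass window, the Gaussian signal and exponential background shapes, the bin count, and the use of distance from the peak as a stand-in for a trained classifier's score. Signal and background are planed separately over shared bins, and the AUC is computed with event weights.

```python
import numpy as np

def flatten_weights(values, edges):
    """Weights that make the histogram of `values` flat over `edges`."""
    idx = np.clip(np.digitize(values, edges) - 1, 0, len(edges) - 2)
    counts = np.bincount(idx, minlength=len(edges) - 1)
    w = 1.0 / counts[idx]
    return w / w.sum()

def weighted_auc(scores, labels, weights):
    """Area under the weighted ROC curve, integrated with the trapezoid rule."""
    order = np.argsort(-scores)  # highest score first
    labels, weights = labels[order], weights[order]
    sig_w = weights * labels
    bkg_w = weights * (1.0 - labels)
    tpr = np.concatenate([[0.0], np.cumsum(sig_w) / sig_w.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(bkg_w) / bkg_w.sum()])
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

rng = np.random.default_rng(1)
lo, hi, peak = 900.0, 1100.0, 1000.0  # hypothetical mass window and resonance mass

# Hypothetical shapes: Gaussian resonance on a falling exponential background.
m_sig = rng.normal(peak, 40.0, 20_000)
m_sig = m_sig[(m_sig >= lo) & (m_sig <= hi)]
m_bkg = lo + rng.exponential(200.0, 20_000)
m_bkg = m_bkg[m_bkg <= hi]

mass = np.concatenate([m_sig, m_bkg])
labels = np.concatenate([np.ones(len(m_sig)), np.zeros(len(m_bkg))])
scores = -np.abs(mass - peak)  # stand-in classifier: closer to the peak is more signal-like

# Before planing: every event has unit weight.
auc_raw = weighted_auc(scores, labels, np.ones(len(mass)))

# After planing: each class is flattened in mass over the same bins.
edges = np.linspace(lo, hi, 41)
w = np.concatenate([flatten_weights(m_sig, edges), flatten_weights(m_bkg, edges)])
auc_planed = weighted_auc(scores, labels, w)

print(f"AUC before planing: {auc_raw:.3f}, after planing: {auc_planed:.3f}")
```

Restricting both classes to a common window matters in this sketch: if one class populated bins the other did not, the flattened distributions would still differ in their support and the mass would remain discriminating.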

The authors extend their investigation to different coupling scenarios (vector versus left-handed couplings) for the $Z'$ boson. The results indicate that features beyond the invariant mass carry discriminating information in the left-handed coupling scenario, an inference supported by inspecting the rapidity ($y$) distributions. After planing both the invariant mass and the rapidity difference ($\Delta |y|$), the deep network's AUC falls to about 0.532, suggesting that the coupling-dependent discriminating information is largely captured by these two variables.

Practical and Theoretical Implications

The study shows that data planing can quantify which features of the data an ML model relies on for discrimination. This has several practical implications:

Enhanced Transparency: Physicists and researchers can infer which variables an ML algorithm is leveraging, aligning the black-box outputs with understandable, physically meaningful inputs.
Model Optimization: By diagnosing the specific features driving model decisions, efforts in feature engineering and model optimization can be more targeted and efficient.
Physics Insight: Identifying and understanding data characteristics leveraged by ML models can provide new insights into physical phenomena, guiding further theoretical and experimental investigations.

Speculation on Future Developments

The authors' methodological contributions open several avenues for future work. Expanding the study of data planing to include more complex signals and higher-dimensional spaces could uncover new discriminative features and improve model interpretability further. Additionally, integrating advanced smoothing techniques or designing networks specifically tailored for weight calculation could enhance planing precision.

Investigating how planing affects other ML architectures, such as convolutional neural networks (CNNs) and transformers, would test how broadly the method applies. Furthermore, the authors propose systematically testing numerous Lorentz invariants to uncover new high-level variables that ML algorithms use efficiently.

Conclusion

The paper offers a methodologically sound approach to bridging the gap between ML's high sensitivity and the need for transparency in physical applications. By introducing, validating, and discussing the implications of data planing, the authors give the community a practical tool for interpreting ML models in particle physics and for applying them more robustly.
