Reweighting and Analysing Event Generator Systematics by Neural Networks on High-Level Features (2503.01452v1)
Abstract: State-of-the-art deep learning (DL) models for jet classification use jet constituent information directly, which improves performance tremendously. This draws attention to interpretability: the decision-making process, the correlations contributing to the classification, and the high-level features (HLFs) that represent the difference between signal and background. We address the interpretability issue using a modular architecture called the analysis model (AM), which takes several motivated HLFs as input. We focus on the generator systematics of top vs. QCD classification by one of the best classifiers, Particle Transformer (ParT). Taking the commonly used event generators Pythia (PY) and Herwig (HW) as examples, we demonstrate that the event weights estimated by the AM generator classifier align the HW classification score distribution with the PY one for QCD jets, with small training uncertainty. This suggests that the AM is sufficient to describe simulated QCD jet features with relatively few observables, and that generator systematics would also be reduced by reweighting the simulation to data. On the other hand, large event weights are required for QCD-like top jets, which leads to imperfect reweighting for both the AM and ParT generator classifiers. Moreover, the AM HLFs are insufficient to describe the differences between PY and HW, resulting in lower reweighting accuracy than with ParT. The missing features are the correlations among the collimated high-energy jet constituents, which are strongly correlated with the energy flow polynomials (EFPs) selected for top vs. QCD classification, showing the complementarity between the AM HLFs and the selected EFPs.
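
To illustrate the idea of estimating event weights from a generator classifier, the following is a minimal toy sketch, not the paper's implementation. It assumes the standard classifier-based density-ratio estimate w(x) = s(x) / (1 - s(x)), where s(x) is the classifier's probability that an event comes from PY; the abstract does not state the exact formula used. A one-dimensional Gaussian feature and a small logistic regression stand in for the HLFs and the AM or ParT networks, and all names (`fit_logistic`, `density_ratio_weights`, `hw`, `py`) are hypothetical.

```python
import numpy as np

def fit_logistic(x, y, lr=0.1, steps=2000):
    """Fit p(y=1|x) = sigmoid(a*x + b) by plain gradient descent (toy classifier)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(a * x + b)))
        a -= lr * np.mean((p - y) * x)  # gradient of the mean cross-entropy w.r.t. a
        b -= lr * np.mean(p - y)        # gradient of the mean cross-entropy w.r.t. b
    return a, b

def density_ratio_weights(score, eps=1e-6):
    """Assumed weight formula: w = s / (1 - s), with the score clipped for stability."""
    s = np.clip(score, eps, 1.0 - eps)
    return s / (1.0 - s)

rng = np.random.default_rng(0)
hw = rng.normal(0.0, 1.0, 20000)  # toy stand-in for a feature of "HW" events
py = rng.normal(0.5, 1.0, 20000)  # toy stand-in for the same feature of "PY" events

# Train the toy generator classifier: label 0 = HW, label 1 = PY.
x = np.concatenate([hw, py])
y = np.concatenate([np.zeros_like(hw), np.ones_like(py)])
a, b = fit_logistic(x, y)

# Weight each HW event by the estimated PY/HW density ratio.
score_hw = 1.0 / (1.0 + np.exp(-(a * hw + b)))
w = density_ratio_weights(score_hw)

unweighted_mean = hw.mean()
reweighted_mean = np.average(hw, weights=w)
print("HW mean:", unweighted_mean)
print("HW mean after reweighting:", reweighted_mean)
print("PY mean:", py.mean())
```

In this toy setup the reweighted HW mean moves close to the PY mean. Regions where the two samples differ strongly produce large weights and a smaller effective sample size, which is consistent with the abstract's remark that large event weights lead to imperfect reweighting.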