
Boosted-Object Tagging Algorithms

Updated 16 October 2025
  • Boosted-object tagging algorithms are advanced classification methods that detect high-momentum particles by analyzing jet decay patterns and substructure features.
  • They leverage observables like N‑subjettiness and energy correlation functions to distinguish signal jets from overwhelming QCD backgrounds.
  • Hybrid taggers combining traditional cut-based techniques with machine learning improve precision and enable real‑time FPGA implementations in collider experiments.

A boosted-object tagging algorithm is a classification method that seeks to identify highly Lorentz-boosted Standard Model particles such as electroweak bosons, top quarks, and Higgs bosons via their characteristic decay topologies within hadronic jets. When these particles are produced with transverse momentum $p_T \gg 2m$, their hadronic decay products are collimated into a single, large-radius (“fat”) jet, whose internal structure can be exploited to distinguish signal from copious QCD backgrounds. Over the past decade, the field has evolved from cut-based taggers built on physically motivated jet substructure observables to sophisticated, interpretable hybrid frameworks incorporating machine learning and real-time hardware implementations. Advances in tagging algorithms are central to LHC new physics searches, precision Standard Model measurements, and detector and trigger design.

1. Core Principles of Boosted-Object Tagging

Boosted-object tagging leverages differences in jet substructure to separate hadronically decaying heavy particles from background QCD jets. The key principle exploits the "prongness" of the energy flow inside jets formed by the decay of a color singlet or triplet (e.g., $W$, $Z$, $H$, $t$):

  • Subjet Multiplicity: Boosted $W$, $Z$, or $H\to b\bar{b}$ decays yield a two-prong structure; top quark decays ($t \to bW \to bqq$) generate a three-prong pattern.
  • Jet Clustering and Grooming: Fat jets are reconstructed with clustering algorithms (commonly anti-$k_t$, Cambridge–Aachen, or $k_t$) and then subjected to jet grooming (trimming, pruning, filtering, SoftDrop, or dynamical grooming) to remove soft contamination from pileup and the underlying event.
  • Infrared/Collinear Safety: Observables and grooming procedures are constructed to be IR/collinear safe for calculability and stability under soft/collinear emissions.

An archetypal observable is $N$-subjettiness (Thaler et al., 2010): $\tau_N = \frac{1}{d_0} \sum_k p_{\mathrm{T},k}\min\{\Delta R_{1,k},\ldots,\Delta R_{N,k}\}$ with $d_0 = \left(\sum_k p_{\mathrm{T},k}\right)R_0$. The discriminating power comes from ratios such as $\tau_{21} = \tau_2/\tau_1$ for two-prong decays or $\tau_{32} = \tau_3/\tau_2$ for top quarks.
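The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes the $N$ candidate subjet axes are already known (in practice they are usually found by exclusive clustering or by minimization), and the function names, the toy constituents, and the choice $R_0 = 1$ are all hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with dphi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def n_subjettiness(constituents, axes, r0=1.0):
    """tau_N for constituents [(pt, eta, phi), ...] and N given axes [(eta, phi), ...]."""
    d0 = sum(pt for pt, _, _ in constituents) * r0
    total = 0.0
    for pt, eta, phi in constituents:
        # Each constituent contributes pT times the distance to its nearest axis.
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) for a_eta, a_phi in axes)
    return total / d0

# Toy two-prong jet (hypothetical numbers): two tight clusters of constituents.
jet = [(100.0, 0.30, 0.00), (20.0, 0.32, 0.02),
       (90.0, -0.30, 0.00), (15.0, -0.28, -0.02)]

tau1 = n_subjettiness(jet, axes=[(0.0, 0.0)])                  # one axis at the jet centre
tau2 = n_subjettiness(jet, axes=[(0.30, 0.0), (-0.30, 0.0)])   # one axis per prong
tau21 = tau2 / tau1
```

For this toy jet $\tau_2$ is much smaller than $\tau_1$, so $\tau_{21}$ is close to zero, which is the behaviour expected of a genuine two-prong decay.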

2. Tagging Methodologies: Traditional Algorithms

Early and contemporary traditional taggers rely on physically motivated, high-level observables (Behr, 2014, Kasieczka, 2018, Rentala et al., 2014). The most common strategies include:

  • $N$-subjettiness taggers: Defined above, with cuts on $\tau_{21}$ (for $W$, $Z$, $H$) and $\tau_{32}$ (for $t$) after an invariant mass window. Typical working points yield ~40% efficiency for $W$ jets at a 1% mis-tag rate, and ~30% for top quarks at similar fake rates (Thaler et al., 2010).
  • Mass Drop and Symmetry Cuts: Based on the BDRS tagger (Butterworth et al.), requiring that the most massive subjet has $m_{j_1} < \mu m_j$ and $(\min\{p_{T,j_1}^2,\,p_{T,j_2}^2\}/m_j^2)\,\Delta R_{j_1,j_2}^2 > y_{\rm cut}$ (Rentala et al., 2014, Bose et al., 2 Aug 2024).
  • Groomed Jet Mass: Requiring the mass of the groomed jet to lie in a window around the target resonance ($m_W$, $m_Z$, $m_H$, $m_t$) reduces QCD backgrounds substantially (Mehtar-Tani et al., 2020, Thaler et al., 2010).
  • Energy Correlation Functions (ECF): Hierarchical correlators capturing angular and energy correlations, including $C_2$, $D_2$, $D_3$ (Bhattacherjee et al., 2022).
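The mass-drop and symmetry conditions listed above translate directly into a boolean test. The sketch below assumes the jet has already been declustered into its two leading subjets; the function name and argument names are hypothetical, and the defaults $\mu = 0.67$ and $y_{\rm cut} = 0.09$ are the values commonly quoted for the BDRS tagger rather than numbers taken from this article.

```python
def passes_mass_drop(m_jet, m_sub1, pt_sub1, pt_sub2, delta_r_12,
                     mu=0.67, y_cut=0.09):
    """Return True if a declustered jet passes the mass-drop and symmetry cuts.

    m_jet      : mass of the parent jet
    m_sub1     : mass of the heavier subjet
    pt_sub1/2  : transverse momenta of the two subjets
    delta_r_12 : angular separation of the two subjets
    """
    # Mass drop: the heavier subjet must carry much less mass than the parent.
    has_mass_drop = m_sub1 < mu * m_jet
    # Symmetry: the softer subjet must not be too soft relative to the jet mass.
    y = min(pt_sub1, pt_sub2) ** 2 / m_jet ** 2 * delta_r_12 ** 2
    return has_mass_drop and y > y_cut

# Hypothetical examples (masses and momenta in GeV).
balanced = passes_mass_drop(125.0, 20.0, 200.0, 150.0, 0.7)    # symmetric two-prong
lopsided = passes_mass_drop(125.0, 20.0, 330.0, 20.0, 0.7)     # soft second subjet
no_drop = passes_mass_drop(125.0, 100.0, 200.0, 150.0, 0.7)    # no significant mass drop
```

Only the first configuration passes both requirements; the second fails the symmetry cut and the third fails the mass drop.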

Table: Key Traditional Tagging Strategies and Discriminants

| Tagger/Observable | Discriminant | Typical signal efficiency / mistag rate (or notes) |
|---|---|---|
| $N$-subjettiness | $\tau_{21}$, $\tau_{32}$ | $40\%/1\%$ ($W$), $30\%/1\%$ (top) |
| BDRS Mass Drop | $\mu$, $y_{\rm cut}$ | Used for $H\to b\bar{b}$; see (Bose et al., 2 Aug 2024) |
| Groomed Mass | Jet mass window | Background reduction $>10\times$ |
| ECF / D-variables | $D_2$, $D_3$ ratios | Robust to pileup, multi-prong sensitivity |
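As an illustration of the ECF-based discriminants in the table, the sketch below evaluates $D_2 = e_3 / e_2^3$ using the standard normalized two- and three-point energy correlation functions, $e_2 = \sum_{i<j} z_i z_j \theta_{ij}^\beta$ and $e_3 = \sum_{i<j<k} z_i z_j z_k (\theta_{ij}\theta_{ik}\theta_{jk})^\beta$, with $z_i$ the momentum fraction of constituent $i$. The function names, the choice $\beta = 1$, and the toy jets are assumptions made for the example.

```python
import math
from itertools import combinations

def delta_r(a, b):
    """Angular distance between constituents a, b = (pt, eta, phi)."""
    dphi = (a[2] - b[2] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(a[1] - b[1], dphi)

def ecf_d2(constituents, beta=1.0):
    """D2 = e3 / e2**3 from normalized energy correlation functions."""
    pt_sum = sum(c[0] for c in constituents)
    z = [c[0] / pt_sum for c in constituents]
    idx = range(len(constituents))
    theta = lambda i, j: delta_r(constituents[i], constituents[j])
    e2 = sum(z[i] * z[j] * theta(i, j) ** beta
             for i, j in combinations(idx, 2))
    e3 = sum(z[i] * z[j] * z[k] * (theta(i, j) * theta(i, k) * theta(j, k)) ** beta
             for i, j, k in combinations(idx, 3))
    return e3 / e2 ** 3

# Toy two-prong jet (hypothetical numbers): two tight clusters.
two_prong = [(100.0, 0.30, 0.00), (20.0, 0.32, 0.02),
             (90.0, -0.30, 0.00), (15.0, -0.28, -0.02)]
# Toy jet with three equally hard, well-separated prongs: no two-prong structure.
three_prong = [(50.0, 0.00, 0.000), (50.0, 0.50, 0.000), (50.0, 0.25, 0.433)]

d2_two_prong = ecf_d2(two_prong)
d2_three_prong = ecf_d2(three_prong)
```

A small $D_2$ indicates that the three-point correlator is suppressed relative to the two-point one, as happens for the two-prong toy jet; the three-prong configuration gives a markedly larger value.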

Combined use of mass and substructure improves performance multiplicatively; for example, S/B is enhanced by a factor $\sim (\epsilon_{\text{signal}}/\epsilon_{\text{background}})^2$ when cuts are made on both leading jets in resonance searches (Thaler et al., 2010).
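To make the scaling concrete, the short calculation below plugs the working points quoted earlier (about 40% at 1% mistag for $W$ jets, about 30% at 1% for top jets) into the squared-ratio formula. It is only a back-of-the-envelope sketch: it assumes the two jet tags are statistically independent, and the function name is hypothetical.

```python
def sb_enhancement(eff_signal, eff_background, n_tagged_jets=2):
    """S/B improvement factor when each of n jets is tagged independently."""
    return (eff_signal / eff_background) ** n_tagged_jets

# Working points quoted in the text; tagging both leading jets.
w_gain = sb_enhancement(0.40, 0.01)    # about 40**2 = 1600
top_gain = sb_enhancement(0.30, 0.01)  # about 30**2 = 900
```

Under these assumptions, tagging both jets raises S/B by roughly three orders of magnitude, at the cost of keeping only $\epsilon_{\text{signal}}^2$ of the signal (16% and 9% for the two examples).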

3. Sensitivity to QCD Color Flow and Event Structure

Tagger performance depends crucially on the underlying color structure of the event (Joshi et al., 2012, Salam et al., 2016). Color singlet resonances (e.g., $Z' \to t\bar{t}$ via KK photon) generate different jet radiation patterns than octet resonances (e.g., via KK gluon):

  • Extra Radiation: Color-octet decays have more internal and external QCD radiation, modifying subjet kinematics (mass, $p_T$) and increasing mistag rates.
  • Tagger Dependence: Signal efficiencies differ by 15–75% between color-singlet and color-octet signals under tight cuts, especially at the low mis-tag rates relevant for discovery (Joshi et al., 2012).
  • Mitigation: Minimize the jet radius $R$ to suppress soft radiation, tune mass windows, or use taggers (e.g., HEPTopTagger with built-in jet grooming) with reduced color sensitivity.

The use of dichroic subjettiness ratios, defined by measuring $\tau_2$ on the full jet and $\tau_1$ on the groomed/tagged jet,

$$\tau_{21}^{\text{dichroic}} = \frac{\tau_2^{\text{full}}}{\tau_1^{\text{tagged}}},$$

enhances background suppression by exploiting the difference in large-angle soft radiation between color-singlet signal jets and QCD backgrounds. Improvements of $\sim$25% in signal significance and a reduction in non-perturbative effects by factors of 2–3 are observed relative to traditional ratios (Salam et al., 2016).

4. Machine Learning and Hybrid Taggers

Contemporary developments have integrated deep learning and hybrid models (Paganini, 2017, Kasieczka, 2018, Macaluso et al., 2018, Bose et al., 2 Aug 2024, Bhattacherjee et al., 2022). Key elements include:

  • Neural Networks (CNNs, GNNs, RNNs): CNNs process jet images (calorimetric or tracking $p_T$, multiplicities), while graph neural networks treat jet constituents as nodes with spatial and kinematic features (Macaluso et al., 2018, Sahu et al., 13 Jan 2025).
  • Feature Engineering vs. End-to-End: Some taggers ingest only low-level inputs (e.g., four-vectors, images); others combine high-level variables (e.g., $\tau_{32}$, $D_2$) as features to enhance interpretability and performance (Bhattacherjee et al., 2022).
  • Hybrid Taggers: Combine traditional variables with ML outputs, e.g., GNN-derived class scores as inputs to a boosted decision tree (BDT) for event selection, yielding performance greater than either approach alone (Sahu et al., 13 Jan 2025, Bose et al., 2 Aug 2024).
  • Interpretability Tools: Shapley value analysis (SHAP) quantifies input variable importance in decision trees or ensemble models, elucidating which observables are most predictive (Bhattacherjee et al., 2022, Chowdhury et al., 2023).
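The low-level "jet image" input mentioned in the list above can be illustrated with a simple pixelization step. The sketch below is not the preprocessing of any particular tagger: it assumes constituent coordinates are already given relative to the jet axis, and the grid size, image half-width, and function name are hypothetical choices.

```python
import math

def jet_image(constituents, n_pixels=8, half_width=0.8):
    """Build a pT-weighted (n_pixels x n_pixels) image from jet constituents.

    constituents: [(pt, d_eta, d_phi), ...] with coordinates relative to the jet axis.
    Rows index d_phi, columns index d_eta; pixel intensities are normalized to sum to 1.
    """
    image = [[0.0] * n_pixels for _ in range(n_pixels)]
    pixel_size = 2.0 * half_width / n_pixels
    for pt, d_eta, d_phi in constituents:
        ix = math.floor((d_eta + half_width) / pixel_size)
        iy = math.floor((d_phi + half_width) / pixel_size)
        # Constituents outside the image window are dropped.
        if 0 <= ix < n_pixels and 0 <= iy < n_pixels:
            image[iy][ix] += pt
    total = sum(sum(row) for row in image)
    if total > 0:
        image = [[value / total for value in row] for row in image]
    return image

# Toy jet (hypothetical): two prongs plus one constituent outside the window.
image = jet_image([(100.0, 0.3, 0.05), (50.0, -0.3, 0.05), (5.0, 2.0, 0.0)])
```

A CNN-based tagger would take such a grid as its input tensor, whereas a hybrid tagger would feed the network's output score, together with high-level observables, into a second-stage classifier such as a BDT.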

Table: Machine Learning Approaches and Inputs

| ML Tagger Type | Input Features | Role |
|---|---|---|
| CNN/DeepTop | Jet images ($p_T$ maps etc.) | End-to-end jet classification ($W$, top, QCD, …) |
| XGBoost BDT | High-level observables | Variable ranking, hybrid taggers |
| GNN/LorentzNet | Jet constituent 4-vectors, graphs | Fat jet multiclassification (top/H/QCD) |

Hybrid methods can handle both SM and BSM signatures, including rare flavor-violating decays, and can probe previously unexplored boosted regimes (Chowdhury et al., 2023, Zhao et al., 28 Feb 2025).

5. Hardware, Real-Time Application, and Latency Constraints

With increasing trigger rates and HL-LHC luminosity, real-time, FPGA-based ML triggers are required for prompt event selection (Bileska, 8 May 2025):

  • FPGA Deployment: Models (e.g., WOMBAT) are distilled into quantized, resource-constrained versions running on FPGA hardware at the L1 trigger, achieving a latency of fewer than 22 clock cycles for identification of $H\rightarrow b\bar{b}$ on calorimeter trigger primitives (Bileska, 8 May 2025).
  • Latency and Rate Control: WOMBAT achieves comparable signal efficiency at an offline $p_T$ threshold of 146.8 GeV (40.6 GeV lower than traditional single-jet triggers) at a 1 kHz output rate.
  • Knowledge Distillation: Separates high-fidelity master models (offline) from apprentice (quantized, real-time) models, enabling practical, efficient real-time inference with constrained resources.
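The two ingredients named above, distillation and quantization, can be sketched generically. The code below is not the WOMBAT procedure: it shows a common form of distillation loss (the KL divergence between temperature-softened teacher and student outputs) and a simple signed fixed-point rounding of the kind used when mapping weights to FPGA arithmetic. All function names, the temperature, and the bit widths are assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [v / temperature for v in logits]
    shift = max(scaled)
    exps = [math.exp(v - shift) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened class probabilities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def quantize_fixed_point(x, total_bits=8, frac_bits=5):
    """Round x to a signed fixed-point grid with saturation at the range limits."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    q = max(q_min, min(q_max, round(x * scale)))
    return q / scale

# Hypothetical two-class logits (signal, background) for one jet.
loss_same = distillation_loss([2.0, -1.0], [2.0, -1.0])
loss_diff = distillation_loss([2.0, -1.0], [0.5, 0.5])

# Hypothetical weights mapped to 8-bit fixed point with 5 fractional bits.
weights_q = [quantize_fixed_point(w) for w in (0.3, 10.0, -10.0)]
```

The loss vanishes when the apprentice reproduces the master's outputs and grows as they diverge, while the quantizer shows how values are snapped to multiples of $2^{-5}$ and saturated at the edges of the representable range.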

FPGA-based ML triggers demonstrate that sophisticated jet substructure algorithms can operate within the tight timing and resource restrictions of LHC trigger systems, with significant implications for Phase-2 and beyond.

6. Practical Impact and Applications

Boosted-object tagging algorithms underpin a broad range of LHC applications:

  • Standard Model Measurements: Improved top tagging enhances $|V_{cb}|$ extractions using boosted $bc$ signatures with in-situ calibration, yielding $\sim$30% improved precision under HL-LHC expectations (Zhao et al., 28 Feb 2025).
  • Exotic and BSM Searches: Improved sensitivity to rare topologies, e.g., $H^\pm\to bc$ searches may gain a factor of 2–5 in reach via AI-based taggers (Zhao et al., 28 Feb 2025), and dedicated taggers probe rare flavor-violating top decays ($t\to cH$) in the boosted regime (Chowdhury et al., 2023).
  • Trigger and Data Acquisition: Real-time selection of boosted Higgs, top, and boson decays at low operation thresholds, crucial for Phase-2 trigger design (Bileska, 8 May 2025).
  • Interpretability: SHAP-assisted hybrid taggers rank variables by explanatory power; mass, $N$-subjettiness, and ECFs are consistently dominant contributors (Bhattacherjee et al., 2022).

Supported by a suite of methodologies, boosted-object tagging algorithms enable efficient selection, precision study, and physical understanding of heavy objects under challenging experimental conditions. Their interpretability, adaptability to hardware, and robustness to QCD and detector effects will remain essential in the HL-LHC and in future collider environments.
