DeepTau Algorithm Overview

Updated 11 November 2025
  • DeepTau is a CNN-based tau lepton identification method that uses image-based PF representations and domain adaptation to enhance discrimination of genuine taus from backgrounds.
  • It achieves a 30–50% reduction in jet misidentification rates at fixed tau efficiency, while its domain adaptation improves the agreement between simulation and collision data.
  • Its architecture integrates multiple input grids, high-level features, and an adversarial branch; dedicated calibration strategies support its direct use in CMS analyses.

DeepTau is a tau lepton identification algorithm developed for the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC), designed to discriminate hadronic tau decays ($\tau_\mathrm{h}$) from backgrounds such as quark and gluon jets, electrons, and muons. Employing convolutional neural network (CNN) techniques and, in version 2.5, domain adaptation via backpropagation, DeepTau v2.5 substantially improves both the fidelity of simulation to collision data and the overall identification performance, achieving a 30–50% reduction in jet-to-tau misidentification at fixed efficiency. Its design leverages “image-based” representations of particle-flow (PF) objects around each candidate and incorporates robust calibration strategies for direct usage in physics analyses at $\sqrt{s}=13$ and $13.6$ TeV.

1. Architecture and Input Representation

DeepTau v2.5 utilizes a multi-branch CNN architecture designed to exploit spatial and high-level feature representations of particles near candidate $\tau_\mathrm{h}$ objects:

  • Input Construction:
    • Two overlapping grids in the $\eta$–$\phi$ plane are centered on each HPS-reconstructed $\tau_\mathrm{h}$ candidate:
      • Inner grid: $11 \times 11$ cells, each of size $\Delta\eta \times \Delta\phi = 0.02 \times 0.02$ (corresponding to the signal cone with radius $R=0.1$).
      • Outer grid: $21 \times 21$ cells, each of size $0.05 \times 0.05$ (corresponding to the isolation cone with radius $R=0.5$).
    • Each cell encodes up to seven PF-reconstructed particle types (electron, muon, photon, charged hadron, neutral hadron, and standalone electron/muon) with associated kinematic and identification features (such as $p_T$, $\Delta\eta$, $\Delta\phi$, electric charge, PUPPI weights, electromagnetic/hadronic calorimeter cluster properties, and track-to-vertex compatibility).
  • Additional Features:
    • 43 high-level variables summarize $\tau_\mathrm{h}$ kinematics (e.g., $\eta$, $p_T$, charge, decay-mode prongs), isolation balances, leading-track and secondary-vertex information, variables for discrimination between electromagnetic and hadronic showers, pileup characteristics, etc.
  • Network Structure:
    • The architecture consists of three distinct branches:
      1. The 43 high-level features are processed by fully connected (FC) layers.
      2. The inner $11 \times 11 \times N$ grid is processed by convolutional and pooling layers, yielding a $1 \times 1 \times M_\text{inner}$ embedding.
      3. The outer $21 \times 21 \times N$ grid is processed similarly to provide a $1 \times 1 \times M_\text{outer}$ representation.
    • Outputs from all branches are concatenated and passed through a stack of FC layers and a final softmax layer producing per-class scores $\hat{y} = [\hat{y}_e, \hat{y}_\mu, \hat{y}_\tau, \hat{y}_\text{jet}]$.
    • Parametric ReLU (PReLU) activation is used:

    $$f(x) = \max(0, x) + \alpha \min(0, x)$$

    Batch normalization and dropout ($\mathcal{O}(10$–$20\%)$ per FC layer) are applied for regularization.
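
To make the branch structure concrete, below is a minimal PyTorch sketch of a three-branch network of this kind. It is not the CMS implementation: the class name `DeepTauSketch`, the layer counts, channel widths, dropout rates, and embedding sizes are illustrative assumptions; only the overall topology (two grid branches, one high-level branch, concatenation, an FC stack with a four-class softmax, and PReLU activations) follows the description above.

```python
import torch
import torch.nn as nn

class DeepTauSketch(nn.Module):
    """Illustrative three-branch network; not the CMS implementation."""

    def __init__(self, n_cell_feat=50, n_highlevel=43, m_inner=64, m_outer=64):
        super().__init__()
        # Branch 1: the 43 high-level variables through FC layers.
        self.highlevel = nn.Sequential(
            nn.Linear(n_highlevel, 128), nn.PReLU(),
            nn.BatchNorm1d(128), nn.Dropout(0.2),
        )

        # Branches 2 and 3: convolutions reduce each grid to a 1x1xM embedding.
        def grid_branch(m_out):
            return nn.Sequential(
                nn.Conv2d(n_cell_feat, 64, kernel_size=3), nn.PReLU(),
                nn.Conv2d(64, m_out, kernel_size=3), nn.PReLU(),
                nn.AdaptiveAvgPool2d(1),  # collapse spatial dims to 1x1
                nn.Flatten(),             # -> (batch, m_out)
            )

        self.inner = grid_branch(m_inner)  # 11x11 signal-cone grid
        self.outer = grid_branch(m_outer)  # 21x21 isolation-cone grid

        # Concatenated embeddings -> FC stack -> softmax over [e, mu, tau, jet].
        self.head = nn.Sequential(
            nn.Linear(128 + m_inner + m_outer, 128), nn.PReLU(), nn.Dropout(0.2),
            nn.Linear(128, 4),
        )

    def forward(self, x_hl, x_inner, x_outer):
        z = torch.cat(
            [self.highlevel(x_hl), self.inner(x_inner), self.outer(x_outer)], dim=1
        )
        return torch.softmax(self.head(z), dim=1)

# Toy forward pass: batch of 8 candidates, N=50 features per grid cell.
model = DeepTauSketch()
scores = model(torch.randn(8, 43), torch.randn(8, 50, 11, 11), torch.randn(8, 50, 21, 21))
print(scores.shape)  # torch.Size([8, 4])
```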

2. Domain Adaptation via Backpropagation

To reduce data–simulation discrepancies, DeepTau v2.5 employs a domain adaptation strategy using a gradient reversal layer (GRL):

  • Gradient Reversal Layer (GRL):
    • Inserted between the shared feature-extracting layers and an adversarial branch tasked with classifying the source domain (simulation vs. real data).
    • Forward pass: identity; backward pass: multiplies the incoming gradient by $-\lambda$ for the domain loss, effectively reversing it.
  • Loss Functions:
    • Classification loss ($\mathcal{L}_\text{class}$): Combines categorical cross-entropy (for genuine $\tau_\mathrm{h}$), focal loss components (for overall background discrimination), and targeted cross-entropy penalties for separating jets, electrons, and muons when $\hat{y}_\tau$ is large.
    • Adversarial (domain) loss:

    $$\mathcal{L}_\text{domain} = -\bigl[d \ln \hat{y}_\text{adv} + (1-d) \ln (1-\hat{y}_\text{adv})\bigr]$$

    where $d=1$ for data and $d=0$ for simulation.
    • Combined objective:

    $$\mathcal{L}_\text{total} = \mathcal{L}_\text{class} + \lambda\,\mathcal{L}_\text{domain}$$

    The GRL ensures that the gradient from $\mathcal{L}_\text{domain}$ is negated in the feature-extraction trunk:

    $$G = \nabla_w \mathcal{L}_\text{class} - \lambda \nabla_w \mathcal{L}_\text{domain}$$

    This leads to domain-invariant feature learning, especially in regions with high purity of genuine $\tau_\mathrm{h}$ candidates.
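
A gradient reversal layer is straightforward to implement with automatic differentiation. The following is a minimal PyTorch sketch under the conventions above (identity forward, gradient multiplied by $-\lambda$ backward); the names `GradientReversal` and `grad_reverse` are illustrative, not taken from the CMS code.

```python
import torch

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; multiplies the gradient by -lambda backward."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)  # identity forward

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the domain-loss gradient flowing into the trunk.
        return -ctx.lam * grad_output, None  # no gradient w.r.t. lam

def grad_reverse(x, lam=10.0):
    return GradientReversal.apply(x, lam)

# Sanity check: gradient of a plain sum is 1 everywhere, so after the GRL it
# should come back as -lambda.
feats = torch.randn(4, 16, requires_grad=True)
grad_reverse(feats, lam=10.0).sum().backward()
print(feats.grad[0, 0])  # tensor(-10.)
```

The shared features feed the classification head directly and the adversarial branch through the GRL, so a single backward pass trains the domain classifier normally while pushing the trunk toward domain-invariant features.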

3. Training Datasets, Workflow, and Hyperparameters

  • Datasets:
    • Simulation (2018 conditions): Mix of Z+jets, W+jets, $t\bar{t}$, single-top, diboson, $H\rightarrow\tau\tau$, QCD multijet, “$\tau$-gun,” and $Z'\rightarrow ee$ processes, ensuring uniform $p_T$ and $\eta$ distributions per class.
    • Real data (2018, 13 TeV, 60 fb$^{-1}$): A $Z\rightarrow\tau_\mu\tau_\mathrm{h}$ selection (“$\mu\tau$ control sample”), in which the $\tau_\mathrm{h}$ purity is approximately $76\%$, is used exclusively for domain adaptation.
  • Workflow:
    • Main optimizer (Adam/NAdam) for shared and classification layers, learning rate $\approx 10^{-4}$.
    • Separate optimizer (Adam) for the domain branch, learning rate $\approx 10^{-5}$.
    • Domain loss weighting: $\lambda \approx 10$.
  • This staged training decouples classification performance from simulation–data mismodeling, reducing systematic uncertainties associated with modeling detector effects (a minimal update-step sketch follows below).
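
The sketch below shows one such staged update step, assuming the `grad_reverse` helper from the previous sketch and stand-in modules for the trunk and heads (all names and shapes are illustrative, not the CMS training code). Note that $\lambda$ is applied inside the GRL here, so a plain sum of the two losses yields the combined trunk gradient $G$ given above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins so the sketch runs; the real trunk/heads are the DeepTau branches.
# grad_reverse is the GRL helper from the previous sketch.
trunk = nn.Sequential(nn.Linear(43, 32), nn.PReLU())
class_head = nn.Linear(32, 4)                           # [e, mu, tau, jet]
domain_head = nn.Sequential(nn.Linear(32, 1), nn.Sigmoid())

opt_main = torch.optim.NAdam(
    list(trunk.parameters()) + list(class_head.parameters()), lr=1e-4
)
opt_domain = torch.optim.Adam(domain_head.parameters(), lr=1e-5)

def training_step(x_sim, y_sim, x_data, lam=10.0):
    feats_sim, feats_data = trunk(x_sim), trunk(x_data)
    # Classification loss on simulation only (labels exist only there).
    loss_class = F.cross_entropy(class_head(feats_sim), y_sim)
    # Domain loss on both domains; the GRL negates and scales its gradient in
    # the trunk, implementing G = grad(L_class) - lambda * grad(L_domain).
    feats_all = grad_reverse(torch.cat([feats_sim, feats_data]), lam)
    d = torch.cat([torch.zeros(len(x_sim)), torch.ones(len(x_data))])  # d=1: data
    loss_domain = F.binary_cross_entropy(domain_head(feats_all).squeeze(1), d)

    total = loss_class + loss_domain
    opt_main.zero_grad(); opt_domain.zero_grad()
    total.backward()
    opt_main.step(); opt_domain.step()
    return float(total)

loss = training_step(torch.randn(64, 43), torch.randint(0, 4, (64,)), torch.randn(64, 43))
```

Using two optimizers with different learning rates lets the adversarial branch adapt slowly enough that it never dominates the classification objective.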

4. Performance Evaluation

  • Metrics:
    • $\tau_\mathrm{h}$ identification efficiency: $\varepsilon_\tau = \frac{\#\ \text{genuine}\ \tau_\mathrm{h}\ \text{passing WP}}{\#\ \text{genuine}\ \tau_\mathrm{h}}$.
    • Misidentification (fake) rate: $f_\text{jet} = \frac{\#\ \text{jets passing}\ \tau_\mathrm{h}\ \text{WP}}{\#\ \text{jets}}$ (a toy computation of both quantities is sketched after this list).
  • Results at 13 TeV (2018 simulation):
    • At fixed $\tau_\mathrm{h}$ efficiency, DeepTau v2.5 achieves marked reductions in fake rates compared to v2.1:
      • For $\varepsilon_\tau \simeq 60\%$: $f_\text{jet}$ reduced from $\sim1.2\%$ to $0.6\%$.
      • For $\varepsilon_\tau \simeq 80\%$: $f_\text{jet}$ reduced from $\sim3.0\%$ to $1.5\%$.
    • Electron misidentification is reduced by up to $\sim50\%$ at the tightest working points; muon misidentification remains $\ll0.1\%$.
  • Robustness at 13.6 TeV (2022 data):
    • Despite being trained on 2018 data, domain adaptation reduces data–simulation disagreement in high-$D_\text{jet}$ regions to $\lesssim2\%$, compared to $\sim17\%$ before adaptation.
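
Both working-point metrics are simple counting ratios. Here is a toy NumPy sketch (synthetic scores, not CMS data; names such as `score_tau` are illustrative) of how efficiency and fake rate trade off as the threshold on a discriminator score is varied:

```python
import numpy as np

def efficiency_and_fake_rate(score_tau, score_jet, threshold):
    """Tau ID efficiency and jet fake rate at one working point (WP)."""
    eff = np.mean(score_tau > threshold)   # genuine taus passing the WP
    fake = np.mean(score_jet > threshold)  # jets passing the WP
    return eff, fake

# Toy scores: genuine taus peak near 1, jets near 0. Scanning thresholds
# traces the efficiency vs. fake-rate curve used to compare v2.5 to v2.1.
rng = np.random.default_rng(0)
score_tau = rng.beta(8, 2, 100_000)
score_jet = rng.beta(2, 8, 100_000)
for thr in (0.5, 0.9, 0.99):
    eff, fake = efficiency_and_fake_rate(score_tau, score_jet, thr)
    print(f"WP threshold {thr}: efficiency {eff:.2f}, fake rate {fake:.4f}")
```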

5. Calibration Strategies and Application in Analyses

  • Tag-and-probe Calibration:
    • Tag-and-probe methods in $Z\to\tau_\mu\tau_\mathrm{h}$ and $Z\to\tau_e\tau_\mathrm{h}$ events are used to fit visible-mass ($m_\text{vis}$) distributions and extract:
      • $\tau_\mathrm{h}$ energy-scale corrections (TES): $\Delta E/E$ within $\pm3\%$.
      • $\tau_\mathrm{h}$ identification scale factors ($\text{SF}_{\tau\text{ID}} = \varepsilon_\text{data}/\varepsilon_\text{sim}$) within $0.9$–$1.1$.
    • Both individual (fix TES, fit SF) and combined (profile likelihood in both TES and SF) fitting procedures are implemented.
  • High-$p_T$ Calibration:
    • In $W^*\to\tau\nu$ events with $m_{W^*}>200$ GeV, control regions and $m_T(\tau_\mathrm{h},\text{MET})$ fits provide $\text{SF}_{\tau\text{ID}}$ in $p_T$ bins of [100–200] GeV and $>200$ GeV. The resulting SFs are $1.0\pm(8$–$16\%)$ at high $p_T$.
  • Lepton-to-$\tau_\mathrm{h}$ Misidentification Calibration:
    • $Z\to\mu\mu$ (“$\mu\tau$ probe”) and $Z\to ee$ (“$e\tau$ probe”) tag-and-probe methods are used to determine mis-ID rate scale factors (SFs) as functions of $|\eta|$ and $\tau_\mathrm{h}$ decay mode:
      • $\text{SF}_{\mu\to\tau_\mathrm{h}}(|\eta|)$ rises to $\sim1.2$ for $|\eta|>2.1$.
      • $\text{SF}_{e\to\tau_\mathrm{h}}$ is typically within $5$–$10\%$ of unity, depending on decay mode and detector region.
  • Systematic uncertainties (including luminosity, trigger/isolation, background shaping, mis-ID energy scale, and PDF/scale variations for $W^*$) are catalogued for each DeepTau v2.5 working point, and correction factors are propagated to CMS physics analyses at $\sqrt{s}=13$ and $13.6$ TeV (an application sketch follows below).
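
Downstream, these calibration outputs enter an analysis as an energy-scale shift on the $\tau_\mathrm{h}$ four-momentum and a per-candidate event weight for simulation. A minimal NumPy sketch, assuming hypothetical function and parameter names (only the quoted ranges come from the text above):

```python
import numpy as np

def apply_tau_calibration(tau_pt, tes_shift=0.01, sf_tau_id=1.0):
    """Illustrative application of TES and ID-SF corrections to simulation.

    tes_shift: fractional tau_h energy-scale correction (text quotes |dE/E| within 3%).
    sf_tau_id: ID scale factor eps_data / eps_sim (text quotes roughly 0.9-1.1).
    Returns corrected pT values and the per-event weights to apply.
    """
    corrected_pt = tau_pt * (1.0 + tes_shift)  # shift the tau_h energy scale
    weights = np.full_like(tau_pt, sf_tau_id)  # reweight simulated events
    return corrected_pt, weights

# Toy usage: correct three simulated tau_h candidates and reweight them.
tau_pt = np.array([45.0, 120.0, 250.0])
pt_corr, w = apply_tau_calibration(tau_pt, tes_shift=-0.005, sf_tau_id=0.97)
print(pt_corr, w)
```

In practice the shifts and SFs are binned in $p_T$, $|\eta|$, and decay mode, and their systematic variations are propagated as alternative weights.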

6. Context, Significance, and Outlook

DeepTau v2.5's use of image-based PF encoding, advanced convolutional architectures, adversarial domain adaptation, and extensive calibration achieves significant improvements in distinguishing genuine $\tau_\mathrm{h}$ from jets and other fakes. The 30–50% reduction in jet misidentification at fixed signal efficiency, combined with the reduction of data–simulation discrepancies to a few percent across both LHC Run 2 and early Run 3 datasets, enhances the reliability of CMS analyses involving $\tau_\mathrm{h}$ signatures.

A plausible implication is that further developments could continue to target robustness to changing detector conditions and evolving pileup profiles, leveraging similar domain adaptation frameworks. The algorithm's modular, image-based structure provides a foundation for ongoing improvement and adaptation to future LHC datasets.
