- The paper presents N-fit, a modular deep learning method that combines DCNs, MDNs, and transfer learning to overcome limitations of traditional reconstruction techniques.
- It reports improved angular resolution for single-line events: the mean zenith error falls from 9.7° to 3.7° for the best-selected 50% of events, and azimuth becomes resolvable for the first time, with a mean error of approximately 29°.
- The methodology shows robust performance on both simulated and real data, paving the way for enhanced low-energy neutrino searches in underwater Cherenkov telescopes.
Deep Learning for Single-Line Neutrino Event Reconstruction in ANTARES
Introduction and Motivation
Precise reconstruction of low-energy neutrino events in underwater Cherenkov telescopes remains a key technical challenge because energy and directional information are often severely under-constrained. This is particularly true for single-line (SL) events, where only one detector line registers signals. Traditional χ²-based and likelihood approaches, while robust for many purposes, have notable limitations in resolving key parameters of SL events, especially the azimuthal component. The work in "Deep Learning Framework for Enhanced Neutrino Reconstruction of Single-line Events in the ANTARES Telescope" (2511.16614) addresses these limitations with N-fit, a modular deep learning algorithm that integrates deep convolutional neural networks (DCNs), mixture density networks (MDNs), and transfer learning (TL) for event reconstruction and classification.
Figure 1: Schematic illustration of neutrino direction angles θ (zenithal) and ϕ (azimuthal) in the ANTARES detector reference frame.
ANTARES Detector Context and Data Model
ANTARES, an undersea neutrino telescope, uses 12 vertical lines instrumented with optical modules (OMs) to sample Cherenkov light from charged particles produced in neutrino interactions. The detector geometry plays a central role in limiting the amount of information available for SL events. Given the importance of well-controlled simulations, the methodology utilizes detailed Monte Carlo datasets to support both supervised learning and robust generalization to real experimental data.
SL events are categorized as track-like (mostly from νμ charged-current interactions) or shower-like (from other flavors and neutral-current interactions). Standard χ²-fit reconstructions fail to resolve azimuthal information for SL events due to coplanarity and insufficient hit multiplicity.
Overview of the N-fit Architecture
N-fit is constructed as a collection of highly specialized deep neural network modules, each optimized for a specific aspect of the reconstruction pipeline. Key components include:
- Deep Convolutional Networks (DCNs): Applied to processed PMT hit data, formatted as RGB “images” with the spatial (storey) and temporal structures mapped explicitly to tensor axes.
- Mixture Density Networks (MDNs): Employed for probabilistic regression, enabling robust uncertainty estimation essential for downstream physics analyses.
- Transfer Learning (TL): Both direct (layer freezing) and indirect (PCA-based knowledge distillation) TL strategies are leveraged to propagate high-level features learned in spatial reconstruction tasks into energy inference and event classification stages.
Input data is restructured into normalized RGB “images”, with each color channel encoding the projection of OM directions (Figure 2), enabling the DCNs to exploit spatial correlations relevant to the reconstruction of θ and ϕ.
Figure 2: Example normalized RGB image representing a track-like event in the preprocessing pipeline.
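To make the image encoding concrete, the following minimal sketch bins the hits of one line into a storey × time × channel tensor and scales it to the range [0, 1]. The grid size, the time window, and the mapping of the three channels to the three OMs of a storey are assumptions made for this example; the paper's actual channel encoding (projections of OM directions) and binning are not reproduced here.

```python
import numpy as np

N_STOREYS = 25    # assumed number of storeys on one ANTARES line
N_TIME_BINS = 64  # hypothetical time binning
N_CHANNELS = 3    # assumed: one channel per OM of a storey

def hits_to_image(storey, time_ns, channel, amplitude, t_window=(0.0, 640.0)):
    """Bin hits into a (storey, time, channel) tensor scaled to [0, 1]."""
    storey = np.asarray(storey, dtype=int)
    channel = np.asarray(channel, dtype=int)
    time_ns = np.asarray(time_ns, dtype=float)
    amplitude = np.asarray(amplitude, dtype=np.float32)

    t0, t1 = t_window
    bins = np.floor((time_ns - t0) / (t1 - t0) * N_TIME_BINS).astype(int)
    keep = (bins >= 0) & (bins < N_TIME_BINS)  # drop hits outside the window

    image = np.zeros((N_STOREYS, N_TIME_BINS, N_CHANNELS), dtype=np.float32)
    # np.add.at accumulates correctly when several hits share one pixel
    np.add.at(image, (storey[keep], bins[keep], channel[keep]), amplitude[keep])

    peak = image.max()
    return image / peak if peak > 0 else image
```

Keeping the storey and time axes explicit is what lets a convolutional kernel pick up the slope of the hit pattern along the line, which carries most of the zenith information.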
Figure 3 shows one key submodule, the direction reconstruction network, in detail.
Figure 3: Architecture of the direction reconstruction neural network for θ and ϕ estimation, including MDN output for uncertainty quantification.
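The role of an MDN output head can be sketched in a few lines: the network emits mixture weights, means, and widths, and training minimizes the negative log-likelihood of the true value under that mixture. The example below assumes a one-dimensional Gaussian mixture and hypothetical function names; the paper's exact parameterization is not specified here, and a Gaussian ignores the periodicity of ϕ, so this is a simplification.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def mdn_nll(logits, mu, log_sigma, target):
    """Mean negative log-likelihood of targets under a 1D Gaussian mixture.

    logits, mu, log_sigma: arrays of shape (batch, K); target: shape (batch,).
    """
    log_pi = logits - _logsumexp(logits, axis=1)[:, None]  # log mixture weights
    z = (target[:, None] - mu) / np.exp(log_sigma)
    log_component = -0.5 * z**2 - log_sigma - 0.5 * np.log(2.0 * np.pi)
    return -_logsumexp(log_pi + log_component, axis=1).mean()

def mixture_mean_std(logits, mu, log_sigma):
    """Point estimate and uncertainty: mean and standard deviation of the mixture."""
    pi = np.exp(logits - _logsumexp(logits, axis=1)[:, None])
    mean = (pi * mu).sum(axis=1)
    second_moment = (pi * (np.exp(2.0 * log_sigma) + mu**2)).sum(axis=1)
    return mean, np.sqrt(second_moment - mean**2)
```

The standard deviation returned by `mixture_mean_std` is the kind of per-event uncertainty that a downstream quality cut can act on.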
Direction, Position, and Energy Reconstruction
A major outcome of the N-fit framework is a substantial improvement in angular resolution for SL events. The DCN+MDN structure improves θ reconstruction over the χ²-fit method, with the mean error reduced from 9.7° to 3.7° for the best-selected 50% of events. It also provides, for the first time, resolvable ϕ predictions, with a mean error of approximately 29°. This is a notable result given that earlier methods left the azimuth of SL events effectively undetermined.
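The sketch below shows one way such a figure could be computed: rank events by their predicted uncertainty, keep the better half, and average the absolute angular error of the survivors. It assumes that "best-selected" means lowest predicted uncertainty, which the source does not state explicitly; the function names and the quantile-based threshold are likewise choices made for this example.

```python
import numpy as np

def angular_diff_deg(pred_deg, true_deg):
    """Absolute angular difference in degrees, wrapped into [0, 180]."""
    delta = np.asarray(pred_deg, dtype=float) - np.asarray(true_deg, dtype=float)
    return np.abs((delta + 180.0) % 360.0 - 180.0)

def mean_error_after_cut(abs_error_deg, sigma_deg, keep_fraction=0.5):
    """Mean error of the events whose predicted sigma is in the lowest fraction."""
    abs_error_deg = np.asarray(abs_error_deg, dtype=float)
    sigma_deg = np.asarray(sigma_deg, dtype=float)
    threshold = np.quantile(sigma_deg, keep_fraction)
    selected = sigma_deg <= threshold
    return abs_error_deg[selected].mean(), threshold
```

The wrap-around in `angular_diff_deg` matters for ϕ, where a prediction of 350° against a true value of 10° is a 20° error, not a 340° one.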
The modular organization of N-fit includes dedicated submodules for estimating the closest approach point (for tracks) and the interaction vertex (for showers). The internal feature activations from these modules are repurposed via PCA dimensionality reduction and subsequently serve as the basis for energy regression through a lightweight feed-forward network.
Figure 4: Architecture for position (horizontal and vertical) parameter regression for closest point/vertex estimation.
Transfer learning is central for optimal energy estimation, which is especially challenging for SL events due to topology and limited lever arm. Projecting DCN activations onto a subspace that captures the principal explanatory variance yields modest but meaningful gains in regression quality for both shower and track branches.
Figure 5: Energy regression module which utilizes PCA-distilled features from prior spatial networks.
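The indirect transfer step can be illustrated as follows: activations taken from a trained spatial network are projected onto their leading principal components, and a small regressor is fitted on the reduced features. In this sketch PCA is computed with an SVD, and a linear least-squares head stands in for the paper's feed-forward network; the function names, the number of components, and the choice of log-energy as the target are assumptions.

```python
import numpy as np

def fit_pca(activations, n_components):
    """Return the feature mean and the top principal directions (rows)."""
    mean = activations.mean(axis=0)
    _, _, vt = np.linalg.svd(activations - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(activations, mean, components):
    """Project activations onto the retained principal directions."""
    return (activations - mean) @ components.T

def fit_linear_head(features, target):
    """Least-squares fit with a bias term (stand-in for a small FFN)."""
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_log_energy(features, coef):
    return features @ coef[:-1] + coef[-1]
```

The PCA basis is fitted once on training activations and then reused unchanged at inference time, so the energy module never needs gradients from the spatial networks.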
Event Classification and Transfer Learning
The event classification step, which distinguishes track from shower topologies, benefits directly from transfer learning: convolutional blocks from the spatial reconstruction networks are frozen and concatenated in parallel, providing high-level feature descriptors as input to the classifier FFN. With this architecture, N-fit achieves an accuracy of approximately 80%, with recall and precision for tracks and showers in the 75–85% range.
Figure 6: Classifier architecture integrating frozen convolutional features from multiple spatial reconstruction branches.
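Direct transfer by layer freezing can be sketched without a deep learning framework: feature extractors with fixed weights are evaluated in parallel, their outputs are concatenated, and only the classifier head is trained. The example below uses a single tanh layer per branch and a logistic-regression head trained by gradient descent as stand-ins for the frozen convolutional blocks and the classifier FFN; all names, shapes, and hyperparameters are hypothetical.

```python
import numpy as np

def make_frozen_branch(weights):
    """Return a feature extractor whose weights can never be updated."""
    frozen = np.array(weights, dtype=float)  # private copy
    frozen.setflags(write=False)             # enforce "frozen" at the array level
    def features(x):
        return np.tanh(x @ frozen)
    return features

def concat_features(branches, x):
    """Evaluate all frozen branches in parallel and concatenate their outputs."""
    return np.concatenate([branch(x) for branch in branches], axis=1)

def train_head(branches, x, y, lr=0.5, steps=500):
    """Train only the classifier head; branch weights are not touched."""
    feats = concat_features(branches, x)
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))  # predicted P(track)
        grad = p - y                                # d(cross-entropy)/d(logit)
        w -= lr * feats.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def classify(branches, w, b, x):
    """Label 1 (e.g. track) where the head's logit is positive, else 0."""
    return (concat_features(branches, x) @ w + b > 0).astype(int)
```

Because the branches are fixed, their outputs could be computed once and cached, which keeps training the head cheap compared with retraining the spatial networks.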
Extensive robustness checks, including K-fold cross-validation (both randomly partitioned and temporally sorted), demonstrate negligible dependence on data splits or operational history. Application to pure background inputs produces random reconstructions with high predicted uncertainties, validating the physical behavior of the model in edge cases.
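The two partitioning schemes mentioned above differ only in how events are ordered before being cut into folds. The sketch below is a minimal illustration with hypothetical names: without timestamps the events are shuffled, and with timestamps they are sorted so that each fold covers a contiguous period of detector operation.

```python
import numpy as np

def kfold_indices(n_events, k, timestamps=None, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation.

    timestamps=None -> random partition; otherwise folds are contiguous in time.
    """
    if timestamps is None:
        order = np.random.default_rng(seed).permutation(n_events)
    else:
        order = np.argsort(np.asarray(timestamps), kind="stable")
    folds = np.array_split(order, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx
```

Comparing the spread of a metric across random folds with its spread across time-ordered folds gives a direct check of whether performance depends on the operational period.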
Comparison between MC and real data, performed through distributions of reconstructed zenith and azimuth, reveals close correspondence, particularly once strict quality cuts on angular uncertainty are enforced. This affirms the transferability of N-fit models, trained purely on MC, to real experimental data.

Figure 7: Comparison of reconstructed zenith and azimuth angle distributions for MC simulations and ANTARES data, showing agreement after uncertainty-based quality cut.
Application to an IceCube-alert follow-up on a blazar (PKS 0735+17) illustrates the physics reach of N-fit. The methodology allows computation of upper limits on neutrino fluence at energies significantly below 30 GeV, a regime inaccessible to traditional multi-line reconstructions and previously unattainable for SL events in ANTARES.
Conclusion
This work establishes a rigorous, modular deep learning approach to the reconstruction and classification of single-line events in underwater Cherenkov telescopes. The integration of DCNs, MDNs, and transfer learning produces marked enhancements in angular and energy resolution for challenging event topologies. The modular logic of N-fit ensures flexibility and extensibility for future upgrades, and the demonstrated robustness on both simulated and real data paves the way for cross-detector adoption in next-generation experiments such as KM3NeT.
Theoretical implications include the demonstration that feature transfer across modular DNN structures improves performance on complex, under-constrained inverse problems in astroparticle physics. Practically, N-fit enhances sensitivity in low-energy neutrino searches and multimessenger campaigns, thus expanding the scientific reach of underwater neutrino telescopes.
Future developments may focus on (1) integrating more advanced architectures (e.g., GNNs for topology-agnostic inputs), (2) improving uncertainty quantification via normalizing flows or Bayesian deep learning, and (3) adapting the presented techniques to hybrid or multi-detector environments. N-fit's methodology offers a template for analysis in high-dimensional, information-sparse contexts in astroparticle experiments.