E-DeflareNet: Physics-Guided Flare Removal
- E-DeflareNet is a physics-guided, learning-based restoration network that effectively removes lens flare artifacts in event camera data.
- It leverages a novel 3D U-Net architecture with residual learning to suppress nonlinear flare effects, achieving state-of-the-art performance.
- Evaluations on simulated and real-world benchmarks show significant improvements in event imaging quality and 3D reconstruction accuracy.
E-DeflareNet is a physics-guided, learning-based restoration network designed to address the problem of lens flare in event camera data. Leveraging a novel physics-based forward model of nonlinear event suppression due to flare, E-DeflareNet achieves state-of-the-art performance in removing flare artifacts from asynchronous event streams. This advancement enables substantial improvements in event-based imaging and 3D reconstruction applications. E-DeflareNet is evaluated on both a large-scale simulated benchmark (E-Flare-2.7K) and a paired real-world dataset (E-Flare-R), exhibiting significant gains across event- and voxel-level metrics (Han et al., 9 Dec 2025).
1. Lens Flare in Event Cameras: Physical Foundation
Event cameras asynchronously record brightness changes at high temporal resolution, offering advantages for high-dynamic-range vision tasks. However, they remain susceptible to lens flare: optical artifacts induced by internal lens reflections and scatter. In event streams, lens flare produces complex spatio-temporal distortions by injecting spurious events and non-linearly suppressing valid scene events under intense illumination. The forward model describes the observed irradiance as the superposition

$$I_{\mathrm{obs}}(\mathbf{x}, t) = I_{\mathrm{bg}}(\mathbf{x}, t) + I_{\mathrm{fl}}(\mathbf{x}, t),$$

where $I_{\mathrm{bg}}$ is the background irradiance and $I_{\mathrm{fl}}$ is the irradiance due to flare. An event is emitted when the log-irradiance change exceeds a contrast threshold $C$, leading to the event emission constraint

$$\left| \log I_{\mathrm{obs}}(\mathbf{x}, t) - \log I_{\mathrm{obs}}(\mathbf{x}, t_{\mathrm{prev}}) \right| \ge C.$$

Differentiating the log-irradiance and incorporating the intensity superposition shows that the resultant event stream mixes the two components with dynamic, time-varying weights:

$$\frac{\partial}{\partial t}\log I_{\mathrm{obs}} = w_{\mathrm{bg}}\,\frac{\partial}{\partial t}\log I_{\mathrm{bg}} + w_{\mathrm{fl}}\,\frac{\partial}{\partial t}\log I_{\mathrm{fl}}, \qquad w_{\mathrm{bg}} = \frac{I_{\mathrm{bg}}}{I_{\mathrm{obs}}},\quad w_{\mathrm{fl}} = \frac{I_{\mathrm{fl}}}{I_{\mathrm{obs}}}.$$

This produces an ideal, virtual event stream in which the background contribution is rescaled by $w_{\mathrm{bg}}$. In flare-dominated regions ($I_{\mathrm{fl}} \gg I_{\mathrm{bg}}$, hence $w_{\mathrm{bg}} \to 0$), background events are strongly suppressed. The physically realizable event stream is recovered by reapplying the event-generation operator to the integrated ideal stream. Notably, this process has no closed-form inverse, motivating the development of data-driven restoration methods (Han et al., 9 Dec 2025).
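The dynamic-weight relation above can be sketched numerically. The function below is an illustrative scalar stand-in for the superposition weighting, not code from the paper:

```python
import numpy as np

def log_derivative_weights(I_bg, I_fl):
    """Dynamic mixing weights from the superposition I_obs = I_bg + I_fl.

    Differentiating log(I_bg + I_fl) gives
        d/dt log I_obs = w_bg * d/dt log I_bg + w_fl * d/dt log I_fl,
    with w_bg = I_bg / I_obs and w_fl = I_fl / I_obs.
    """
    I_obs = I_bg + I_fl
    return I_bg / I_obs, I_fl / I_obs

# When flare dominates (I_fl >> I_bg), the background weight collapses
# toward zero, which is the suppression mechanism the model describes.
w_bg, w_fl = log_derivative_weights(I_bg=1.0, I_fl=99.0)
print(w_bg, w_fl)  # 0.01 0.99
```

The weights always sum to one, so flare gains exactly the influence the background loses.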
2. E-DeflareNet Architecture
E-DeflareNet is based on a residual 3D U-Net architecture ("TrueResidualUNet3D") that operates on voxelized representations of event data. Both input and output are single-channel voxel grids in which eight temporal bins discretize a 20 ms observation window. The network comprises:
- Four levels of downsampling/upsampling.
- Encoder: residual 3D blocks (pairs of convolutions with ReLU activations), followed by 3D max-pooling for downsampling.
- Decoder: transposed convolutions for upsampling, with symmetric skip connections from encoder to decoder.
- Output layer: a zero-initialized convolution followed by identity activation, so the network enforces an identity mapping at initialization.
- Total parameter count: 7.07 M.
The architecture predicts the negative flare residual, enabling residual learning strategies to target only the flare-induced corruption while preserving scene information (Han et al., 9 Dec 2025).
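The voxel representation the network consumes can be sketched as follows. The event tuple format and the hard temporal binning are illustrative assumptions, not the paper's exact preprocessing:

```python
import numpy as np

def voxelize_events(events, h, w, n_bins=8, window_ms=20.0):
    """Accumulate polarity-signed events into an (n_bins, h, w) voxel grid.

    `events` is an (N, 4) array of (t_ms, x, y, polarity) rows with
    polarity in {-1, +1}. Hard bin assignment without temporal
    interpolation is a simplifying choice for illustration.
    """
    grid = np.zeros((n_bins, h, w), dtype=np.float32)
    if len(events) == 0:
        return grid
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    bins = np.clip((t / window_ms * n_bins).astype(int), 0, n_bins - 1)
    np.add.at(grid, (bins, y, x), p)  # unbuffered accumulation per voxel
    return grid

# Two positive events within the first 2.5 ms land in temporal bin 0.
ev = np.array([[1.0, 3, 2, 1.0], [2.0, 3, 2, 1.0]])
v = voxelize_events(ev, h=4, w=5)
print(v.shape, v[0, 2, 3])  # (8, 4, 5) 2.0
```

`np.add.at` is used instead of plain fancy-index addition so that repeated events at the same voxel accumulate rather than overwrite.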
3. Training Approach and Data Resources
E-DeflareNet is trained using a Mean Squared Error (MSE) loss between the restored output voxel grid $\hat{V}$ and the ground-truth grid $V^{\mathrm{gt}}$, with no auxiliary adversarial or perceptual loss components. The objective is formulated as

$$\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{V}_i - V^{\mathrm{gt}}_i \right)^2,$$

where the sum runs over all $N$ voxels. Training leverages two benchmark resources:
E-Flare-2.7K (Simulated Training Set):
- 2,720 paired samples (20 ms voxel grids each), split into 2,545 training / 175 test.
- Background event streams sourced from DSEC.
- Flare and light source events synthesized from Flare7K++ assets, augmented via ego-motion scripting, flicker (100–140 Hz), geometric transforms, hybrid rendering (scattering, reflections), and conversion through a DVS simulator.
- Dataset labeling performed via the Probabilistic Non-Linear Event Suppression (PNL-ES) operator.
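The DVS-simulator conversion step in the synthesis pipeline rests on the contrast-threshold event model. The single-pixel sketch below illustrates that principle, including how the 100-140 Hz flicker augmentation yields alternating event bursts; the function and its simplifications (no noise, no refractory period) are illustrative assumptions:

```python
import numpy as np

def dvs_events_from_log_intensity(log_I, times, C=0.2):
    """Emit DVS-style events when a log-intensity trace drifts past C.

    `log_I` is a 1D log-intensity trace for one pixel sampled at `times`.
    Simplified single-pixel model: no noise, no refractory period.
    """
    events = []
    ref = log_I[0]
    for t, L in zip(times[1:], log_I[1:]):
        while L - ref >= C:      # brightness rose past threshold
            ref += C
            events.append((t, +1))
        while ref - L >= C:      # brightness fell past threshold
            ref -= C
            events.append((t, -1))
    return events

# A 100 Hz flicker (within the paper's 100-140 Hz augmentation range)
# produces alternating bursts of positive and negative events.
t = np.linspace(0.0, 0.02, 2000)
I = 1.0 + 0.5 * np.sin(2 * np.pi * 100.0 * t)
evts = dvs_events_from_log_intensity(np.log(I), t)
print(len(evts) > 0, evts[0][1])  # True 1
```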
E-Flare-R (Real-World Paired Test Set):
- Approximately 150 paired sequences of 100 ms each, captured with a Prophesee EVK4-HD event camera.
- Two-pass capture protocol in matched scenes, with and without a removable optical filter.
- Post-processing includes sub-millisecond temporal alignment, spatial masking, noise injection, and cropping (Han et al., 9 Dec 2025).
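The residual prediction scheme and the MSE objective described above can be combined in a minimal NumPy sketch; the shapes and helper names are hypothetical:

```python
import numpy as np

def restore(voxel_in, predicted_residual):
    """Residual restoration: the network predicts the negative flare
    residual, so the de-flared grid is the input plus that residual."""
    return voxel_in + predicted_residual

def mse_loss(restored, target):
    """Plain MSE over all voxels, matching the training objective
    (no adversarial or perceptual terms)."""
    return float(np.mean((restored - target) ** 2))

# Toy example with hypothetical shapes (8 temporal bins, 4x4 pixels).
rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 4, 4))
flare = 0.3 * rng.normal(size=(8, 4, 4))
corrupted = clean + flare

perfect = restore(corrupted, -flare)       # ideal residual prediction
print(mse_loss(perfect, clean) < 1e-10)    # True (loss near zero)
print(mse_loss(corrupted, clean) > 0.0)    # True
```

Predicting the residual rather than the clean grid directly means a zero-initialized output layer starts from the identity mapping, touching only flare-corrupted voxels as training proceeds.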
4. Quantitative Evaluation
E-DeflareNet outperforms all baselines across both simulated and real-world benchmarks on both event-level and voxel-level metrics. Representative results are as follows.
| Benchmark | Method | Chamfer (↓) | MSE (↓) |
|---|---|---|---|
| E-Flare-2.7K | E-DeflareNet | 0.4477 | 0.1269 |
| E-Flare-2.7K | Second-best baseline | 1.2496 | 0.2851 |
| E-Flare-R | E-DeflareNet | 1.1368 | 0.1741 |
| E-Flare-R | Second-best baseline | 1.7647 | 0.2761 |
On E-Flare-2.7K, the model improves Chamfer distance by 64.2% and MSE by 55.5% over the second-best baseline; on E-Flare-R, the gains are 35.6% in Chamfer distance and 36.9% in MSE. TP-F1 declines slightly, reflecting a trade-off that favors restoration fidelity. Ablation studies establish the necessity of both the physics-based suppression prior and residual learning: the full model configuration outperforms variants without intensity weighting, with random jitter, or lacking source-event preservation (Han et al., 9 Dec 2025).
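The Chamfer distance used as the event-level metric can be sketched as a brute-force nearest-neighbor computation over event point sets; this symmetric form is an illustrative stand-in, not necessarily the benchmark's exact normalization:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets.

    `a` and `b` are (N, d) arrays, e.g. events as (t, x, y) points.
    Each point is matched to its nearest neighbor in the other set,
    and the two mean distances are summed.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

# Identical sets score zero; a shifted copy does not.
p = np.array([[0.0, 0.0], [1.0, 1.0]])
q = p + np.array([0.5, 0.0])
print(chamfer_distance(p, p))  # 0.0
print(chamfer_distance(p, q))  # 1.0
```

The brute-force pairwise matrix is O(NM) in memory; practical implementations over dense event streams typically use a k-d tree instead.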
5. Impact on Downstream Vision Tasks
E-DeflareNet's utility extends to multiple event-based downstream tasks:
- Event-Based Imaging (SPADE-E2VID): Images reconstructed from de-flared events using SPADE-E2VID on challenging DSEC-Flare test sequences are free from flare halos and recover fine spatial textures that are otherwise occluded.
- Event-Based 3D Reconstruction (Event3DGS): On a LEGO NeRF synthetic scene with simulated flare, the use of E-DeflareNet yields a novel-view PSNR of 13.78 (vs. 13.72 for the original, uncorrupted sequence) and SSIM of 0.792 (compared to 0.765 for the original). E-DeflareNet is the only solution that delivers consistent improvements across all evaluated baselines (Han et al., 9 Dec 2025).
6. Generalizations, Limitations, and Prospects
The proposed forward model extends to other additive optical disturbances, including reflections, occlusions, and participating media, by incorporating the corresponding dynamic intensity weights. The current network is moderate in size; further optimization could support lightweight or real-time deployment. Fusing event data with frame-based (RGB) sensors is identified as a promising pathway: image-domain priors could inform event de-flaring, or vice versa. Failure cases are observed under very strong flare, where background events are suppressed beyond recovery; addressing these may require higher-order optical modeling or multi-exposure strategies.
In summary, E-DeflareNet advances the state of the art in event-based lens flare removal, offering a validated, physics-guided learning pipeline, paired benchmarks for reproducible evaluation, and demonstrated benefits for critical event-based vision tasks (Han et al., 9 Dec 2025).