
E-DeflareNet: Physics-Guided Flare Removal

Updated 16 December 2025
  • E-DeflareNet is a physics-guided, learning-based restoration network that effectively removes lens flare artifacts in event camera data.
  • It leverages a novel residual 3D U-Net architecture that learns to remove flare events and counteract their nonlinear suppression of scene events, achieving state-of-the-art performance.
  • Evaluations on simulated and real-world benchmarks show significant improvements in event imaging quality and 3D reconstruction accuracy.

E-DeflareNet is a physics-guided, learning-based restoration network designed to address the problem of lens flare in event camera data. Leveraging a novel physics-based forward model of nonlinear event suppression due to flare, E-DeflareNet achieves state-of-the-art performance in removing flare artifacts from asynchronous event streams. This advancement enables substantial improvements in event-based imaging and 3D reconstruction applications. E-DeflareNet is evaluated on both a large-scale simulated benchmark (E-Flare-2.7K) and a paired real-world dataset (E-Flare-R), exhibiting significant gains across event- and voxel-level metrics (Han et al., 9 Dec 2025).

1. Lens Flare in Event Cameras: Physical Foundation

Event cameras asynchronously record brightness changes at high temporal resolution, offering advantages for high-dynamic-range vision tasks. However, they remain susceptible to lens flare: optical artifacts induced by internal lens reflections and scatter. In event streams, lens flare produces complex spatio-temporal distortions by injecting spurious events and non-linearly suppressing valid scene events under intense illumination. The forward model describes the observed irradiance as the superposition

$$I_{\mathrm{ob}}(x, y, t) \approx I_{\mathrm{bg}}(x, y, t) + I_{\mathrm{f}}(x, y, t),$$

where $I_{\mathrm{bg}}$ is the background irradiance and $I_{\mathrm{f}}$ is the flare contribution. An event is emitted whenever the log-irradiance changes by more than a contrast threshold $c$, giving the event emission constraint

$$L_{\mathrm{ob}}(t) = L(0) + c \int_0^t E(\tau)\, d\tau + \varepsilon(t), \quad |\varepsilon(t)| < c.$$

Differentiating and incorporating the intensity superposition shows that the resultant event stream carries dynamic, time-varying weights

$$w_{\mathrm{bg}}(t) = \frac{I_{\mathrm{bg}}(t)}{I_{\mathrm{bg}}(t) + I_{\mathrm{f}}(t)}, \qquad w_{\mathrm{f}}(t) = 1 - w_{\mathrm{bg}}(t),$$

which produce an ideal, virtual event stream

$$E_{\mathrm{ideal}}(t) = w_{\mathrm{bg}}(t)\, E_{\mathrm{bg}}(t) + w_{\mathrm{f}}(t)\, E_{\mathrm{f}}(t).$$

In flare-dominated regions ($I_{\mathrm{f}} \gg I_{\mathrm{bg}}$), background events are strongly suppressed. The physically realizable event stream is recovered by reapplying the event-generation operator to the integrated ideal stream. Notably, this process admits no closed-form inverse, motivating data-driven restoration methods (Han et al., 9 Dec 2025).
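To make the suppression mechanism concrete, the following minimal NumPy sketch evaluates the dynamic weights for a single pixel over a 20 ms window; the function name, discretization, and numeric values are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def ideal_event_rate(I_bg, I_f, E_bg, E_f):
    """Mix background and flare event rates with the dynamic weights above.

    I_bg, I_f: per-bin background and flare irradiance at one pixel.
    E_bg, E_f: event rates each component would produce in isolation.
    All names and the discretization are illustrative assumptions.
    """
    w_bg = I_bg / (I_bg + I_f + 1e-12)  # w_bg(t)
    w_f = 1.0 - w_bg                    # w_f(t)
    return w_bg * E_bg + w_f * E_f      # E_ideal(t)

# Flare-dominated regime: I_f >> I_bg strongly suppresses background events.
t = np.linspace(0.0, 0.02, 8)        # 20 ms window, 8 temporal bins
I_bg = np.full_like(t, 10.0)         # dim background irradiance
I_f = np.full_like(t, 1000.0)        # intense flare irradiance
E_bg = np.ones_like(t)               # nominal background event rate
E_f = np.zeros_like(t)               # a steady flare contributes few events
print(ideal_event_rate(I_bg, I_f, E_bg, E_f))  # each bin ~ 0.0099
```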

2. E-DeflareNet Architecture

E-DeflareNet is based on a residual 3D U-Net architecture ("TrueResidualUNet3D") that operates on voxelized representations of event data. Both input and output are single-channel voxel grids $\mathcal{V} \in \mathbb{R}^{8 \times 480 \times 640}$, where eight temporal bins discretize a 20 ms observation window. The network comprises:

  • Four levels of downsampling/upsampling.
  • Encoder: residual 3D blocks (pairs of $3 \times 3 \times 3$ convolutions with ReLU activation), followed by $2^3$ 3D max-pooling.
  • Decoder: transposed $3 \times 3 \times 3$ convolutions for upsampling, with symmetric skip connections from encoder to decoder.
  • Output layer: $1 \times 1 \times 1$ convolution, zero-initialized and followed by identity activation, enforcing an initial identity mapping.
  • Total parameter count: $\sim$7.07 M.

The architecture predicts the negative flare residual, enabling residual learning strategies to target only the flare-induced corruption while preserving scene information (Han et al., 9 Dec 2025).
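The design can be sketched compactly in PyTorch. The snippet below is a two-level analogue of the four-level TrueResidualUNet3D, with invented channel widths; it is meant only to illustrate the residual blocks, symmetric skip connections, zero-initialized $1 \times 1 \times 1$ head, and residual prediction, not to reproduce the published 7.07 M-parameter model.

```python
import torch
import torch.nn as nn

class ResBlock3D(nn.Module):
    """Two 3x3x3 convolutions with ReLU and an identity shortcut."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv3d(in_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv3d(out_ch, out_ch, 3, padding=1)
        self.skip = nn.Conv3d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        h = self.relu(self.conv1(x))
        return self.relu(self.conv2(h) + self.skip(x))

class ResidualUNet3DSketch(nn.Module):
    """Two-level analogue of the four-level TrueResidualUNet3D."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = ResBlock3D(1, ch)
        self.enc2 = ResBlock3D(ch, 2 * ch)
        self.pool = nn.MaxPool3d(2)                      # 2^3 max-pooling
        self.up = nn.ConvTranspose3d(2 * ch, ch, 3, stride=2,
                                     padding=1, output_padding=1)
        self.dec1 = ResBlock3D(2 * ch, ch)
        # Zero-initialized 1x1x1 head: the untrained network is an identity map.
        self.head = nn.Conv3d(ch, 1, 1)
        nn.init.zeros_(self.head.weight)
        nn.init.zeros_(self.head.bias)

    def forward(self, v):
        e1 = self.enc1(v)                                # full-resolution features
        e2 = self.enc2(self.pool(e1))                    # downsampled features
        d1 = self.dec1(torch.cat([self.up(e2), e1], 1))  # symmetric skip connection
        return v + self.head(d1)  # add predicted (negative) flare residual

# Shape check on a reduced grid; the paper's voxel grids are 8 x 480 x 640.
v = torch.randn(1, 1, 8, 96, 128)
print(ResidualUNet3DSketch()(v).shape)  # torch.Size([1, 1, 8, 96, 128])
```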

3. Training Approach and Data Resources

E-DeflareNet is trained using a Mean Squared Error (MSE) loss between the restored output voxel grid and the ground truth, with no auxiliary adversarial or perceptual loss components. The objective is formulated as

$$\mathcal{L}(\theta) = \left\|\widehat{\mathcal{V}}_{\mathrm{gt}} - \mathcal{V}_{\mathrm{gt}}\right\|_2^2 = \sum_{b,y,x} \left(\widehat{\mathcal{V}}_{\mathrm{gt}}(b,y,x) - \mathcal{V}_{\mathrm{gt}}(b,y,x)\right)^2,$$

where $b$ indexes temporal bins and $(y, x)$ the pixel grid; a minimal training-step sketch is given after the dataset descriptions below. Training leverages two benchmark resources:

E-Flare-2.7K (Simulated Training Set):

  • 2,720 paired samples (20 ms each; $8 \times 480 \times 640$ voxels), split 2,545 training / 175 test.
  • Background event streams sourced from DSEC.
  • Flare and light source events synthesized from Flare7K++ assets, augmented via ego-motion scripting, flicker (100–140 Hz), geometric transforms, hybrid rendering (scattering, reflections), and conversion through a DVS simulator.
  • Dataset labeling performed via the Probabilistic Non-Linear Event Suppression (PNL-ES) operator.

E-Flare-R (Real-World Paired Test Set):

  • Approximately 150 paired sequences of 100 ms at $640 \times 480$ resolution, captured on a Prophesee EVK4-HD.
  • Two-pass protocol with/without removable optical filter in matched scenes.
  • Post-processing includes sub-millisecond temporal alignment, spatial masking, noise injection, and cropping (Han et al., 9 Dec 2025).
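The sketch below shows one plausible way to rasterize an event stream into the $8 \times 480 \times 640$ voxel grid and take a single optimization step with the pure MSE objective; the binning scheme, function names, and reduction choice are our assumptions, not the paper's released code.

```python
import numpy as np
import torch
import torch.nn.functional as F

def events_to_voxel(xs, ys, ts, ps, bins=8, H=480, W=640):
    """Accumulate polarity-signed events into a (bins, H, W) voxel grid.

    xs, ys: pixel coordinates; ts: timestamps; ps: polarities in {-1, +1}.
    The uniform temporal binning here is an assumption; the paper's exact
    voxelization may differ.
    """
    grid = np.zeros((bins, H, W), dtype=np.float32)
    t0, t1 = ts.min(), ts.max()
    b = np.clip(((ts - t0) / max(t1 - t0, 1e-9) * bins).astype(int), 0, bins - 1)
    np.add.at(grid, (b, ys, xs), ps)  # signed event counts per bin and pixel
    return torch.from_numpy(grid)

def train_step(model, optimizer, v_in, v_gt):
    """One optimization step with the summed squared-error objective."""
    optimizer.zero_grad()
    v_hat = model(v_in[None, None])                        # add batch/channel dims
    loss = F.mse_loss(v_hat[0, 0], v_gt, reduction="sum")  # sum over b, y, x
    loss.backward()
    optimizer.step()
    return loss.item()
```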

4. Quantitative Evaluation

E-DeflareNet outperforms all baselines across both simulated and real-world benchmarks on both event-level and voxel-level metrics. Representative results are as follows.

Benchmark      Method               Chamfer (↓)   MSE (↓)   Raw-F1 (↑)
E-Flare-2.7K   E-DeflareNet         0.4477        0.1269    –
E-Flare-2.7K   Second-best method   1.2496        0.2851    –
E-Flare-R      E-DeflareNet         1.1368        0.1741    –
E-Flare-R      Second-best method   1.7647        0.2761    –

On E-Flare-2.7K, the model achieves a 64.2% improvement in Chamfer distance and a 55.5% improvement in MSE over the second-best baseline. On E-Flare-R, it yields a 35.6% improvement in Chamfer distance and 36.9% in MSE. TP-F1 shows a slight decline, reflecting a trade-off that favors restoration fidelity. Ablation studies establish the necessity of both the physics-based suppression prior and residual learning: the full model configuration outperforms variants without intensity weighting, with random jitter, or lacking source-event preservation (Han et al., 9 Dec 2025).
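For reference, the event-level Chamfer distance treats each stream as a point cloud in $(x, y, t)$; a common symmetric formulation is sketched below. The temporal scaling and averaging conventions are assumptions, since the exact definition behind the reported numbers is not reproduced here.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two event sets.

    a, b: (N, 3) arrays of (x, y, t), with timestamps pre-scaled so the
    temporal axis is commensurate with pixel units (an assumption).
    """
    d_ab, _ = cKDTree(b).query(a)  # nearest neighbour in b for each event of a
    d_ba, _ = cKDTree(a).query(b)  # nearest neighbour in a for each event of b
    return d_ab.mean() + d_ba.mean()
```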

5. Impact on Downstream Vision Tasks

E-DeflareNet's utility extends to multiple event-based downstream tasks:

  • Event-Based Imaging (SPADE-E2VID): Images reconstructed from de-flared events using SPADE-E2VID on challenging DSEC-Flare test sequences are free from flare halos and recover fine spatial textures that are otherwise occluded.
  • Event-Based 3D Reconstruction (Event3DGS): On a LEGO NeRF synthetic scene with simulated flare, the use of E-DeflareNet yields a novel-view PSNR of 13.78 (vs. 13.72 for the original, uncorrupted sequence) and SSIM of 0.792 (compared to 0.765 for the original). E-DeflareNet is the only solution that delivers consistent improvements across all evaluated baselines (Han et al., 9 Dec 2025).

6. Generalizations, Limitations, and Prospects

The proposed forward model extends to other additive optical disturbances, including reflections, occlusions, and participating media, by incorporating dynamic weights $w_i(t)$. The current network architecture is moderate in size; further optimization could support lightweight or real-time deployment. The fusion of event data with frame-based (RGB) sensors is identified as a promising pathway: leveraging image-domain priors to inform event de-flaring, or vice versa. Extreme failure cases are observed under very strong flare, where preservation of background events becomes impossible; addressing these may require higher-order optical modeling or multi-exposure strategies.
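Written out, this generalization replaces the two-term mixture of Section 1 with a normalized sum over disturbance components; the indexed notation below follows directly from the superposition model, though the paper's exact formulation may differ:

$$E_{\mathrm{ideal}}(t) = \sum_i w_i(t)\, E_i(t), \qquad w_i(t) = \frac{I_i(t)}{\sum_j I_j(t)}, \qquad \sum_i w_i(t) = 1.$$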

In summary, E-DeflareNet advances the state of the art in event-based lens flare removal, offering a validated, physics-guided learning pipeline, paired benchmarks for reproducible evaluation, and demonstrated benefits for critical event-based vision tasks (Han et al., 9 Dec 2025).
