- The paper introduces Flickerformer, which integrates periodicity and directionality priors into a transformer to robustly remove flicker artifacts from burst imaging.
- It employs FFT-based phase fusion and autocorrelation modules alongside wavelet-based directional attention to precisely localize and suppress structured flicker.
- Quantitative evaluations on the BurstDeflicker benchmark reveal significant PSNR, SSIM, and LPIPS improvements, confirming its practical impact on image quality.
Short-exposure photography under AC-powered illumination is fundamentally challenged by spatially and temporally structured flicker artifacts arising from periodic light oscillations and rolling-shutter sensor inconsistencies. Such artifacts degrade image perceptual quality and undermine downstream vision pipelines, especially in dynamic scenarios or burst imaging. Existing restoration frameworks treat flicker as generic noise, thus failing to leverage its inherent physical structure (specifically, periodicity and directionality), resulting in inferior suppression and ghosting.
This paper introduces the Flickerformer architecture, which explicitly integrates periodicity and directionality priors into a transformer-based burst restoration pipeline, yielding systematic improvements in flicker localization and removal.
Methodological Innovations
Periodicity Modeling
The periodic nature of flicker stems from both lighting modulation and sensor exposure mechanisms. Flickerformer operationalizes periodicity via two modules:
- Phase-Based Fusion Module (PFM): Inter-frame phase correlation is deployed in the frequency domain to enable robust multi-frame feature aggregation. Fast Fourier Transform (FFT) is applied per frame to extract amplitude and phase spectra, with phase similarity computed by element-wise comparison and used as adaptive frequency-domain weighting for reference frames. The fusion leverages these weights, enhancing flicker localization while avoiding ghosting.
- Autocorrelation Feed-Forward Network (AFFN): Intra-frame periodicity is reinforced by calculating spatial autocorrelation via the Wiener-Khinchin theorem (squared magnitude in the frequency domain followed by an inverse FFT), which amplifies recurring flicker structures and suppresses uncorrelated noise. AFFN refines fused features through dual-domain processing and depth-wise gated feed-forward layers, further encoding periodic cues.
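The phase-weighted fusion idea behind PFM can be illustrated with a minimal sketch. This is not the paper's implementation; the cosine-based phase-similarity weighting and the function `phase_fusion` are illustrative assumptions standing in for the learned frequency-domain weighting described above.

```python
import numpy as np

def phase_fusion(ref, frames):
    """Hypothetical sketch of phase-based multi-frame fusion:
    weight each frame's frequency content by how well its phase
    spectrum agrees with the reference frame's phase spectrum."""
    F_ref = np.fft.fft2(ref)
    phase_ref = np.angle(F_ref)
    fused = np.zeros_like(F_ref)
    total_w = np.zeros(ref.shape)
    for frame in frames:
        F = np.fft.fft2(frame)
        # Phase similarity in [0, 1]: 1 when phases align exactly.
        w = 0.5 * (1.0 + np.cos(np.angle(F) - phase_ref))
        fused += w * F
        total_w += w
    fused /= np.maximum(total_w, 1e-8)  # normalized weighted average
    return np.fft.ifft2(fused).real
```

Frequency bins whose phase disagrees with the reference (e.g., flicker-corrupted content) receive small weights, which is the intuition behind using phase correlation to avoid ghosting during aggregation.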
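The Wiener-Khinchin computation inside AFFN can likewise be sketched in a few lines; the function name `spatial_autocorrelation` and the zero-lag centering via `fftshift` are illustrative choices, not details from the paper.

```python
import numpy as np

def spatial_autocorrelation(x):
    """Autocorrelation via the Wiener-Khinchin theorem:
    the inverse FFT of the power spectrum |FFT(x)|^2."""
    F = np.fft.fft2(x)
    power = np.abs(F) ** 2      # squared magnitude in frequency domain
    ac = np.fft.ifft2(power).real
    return np.fft.fftshift(ac)  # move the zero-lag peak to the center
```

For a feature map containing periodic flicker stripes, this circular autocorrelation exhibits secondary peaks at lags matching the stripe period, while uncorrelated noise contributes only a central peak, which is why it amplifies recurring structure.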
Directionality Exploitation
Rolling-shutter sensors induce strong directional structuring in flicker, manifesting as horizontally or vertically aligned luminance stripes. Flickerformer leverages this via:
- Wavelet-Based Directional Attention Module (WDAM): Haar wavelet decomposition splits the input feature into low-frequency and orientation-specific high-frequency subbands. Directional weights are synthesized from high-frequency horizontal and vertical components via convolution and sigmoid activation. These weights modulate window-based multi-head attention applied only to low-frequency subband, enabling precise identification and restoration of flicker-affected regions, with substantial reduction in computational overhead.
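The Haar decomposition and directional gating in WDAM can be sketched as follows. This is a simplified single-channel illustration under stated assumptions: the subband arithmetic is the standard single-level 2D Haar transform, while combining detail magnitudes through a sigmoid (`directional_weights`) is a plausible stand-in for the convolution-plus-sigmoid weighting the paper describes.

```python
import numpy as np

def haar_subbands(x):
    """Single-level 2D Haar decomposition into LL, LH, HL, HH.
    Assumes even height and width."""
    a = x[0::2, 0::2]
    b = x[0::2, 1::2]
    c = x[1::2, 0::2]
    d = x[1::2, 1::2]
    ll = (a + b + c + d) / 4  # low-frequency approximation
    lh = (a + b - c - d) / 4  # variation between rows (horizontal stripes)
    hl = (a - b + c - d) / 4  # variation between columns (vertical stripes)
    hh = (a - b - c + d) / 4  # diagonal detail
    return ll, lh, hl, hh

def directional_weights(lh, hl):
    """Hypothetical directional gating: squash the combined
    horizontal/vertical detail magnitude through a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(np.abs(lh) + np.abs(hl))))
```

Because rolling-shutter flicker forms aligned stripes, the LH or HL subband lights up wherever stripes are present, and the resulting weights can bias window attention (applied to the LL subband, at a quarter of the spatial resolution) toward flicker-affected regions.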
Architecture Overview
Flickerformer integrates PFM, AFFN, and WDAM within a U-shaped encoder-decoder transformer backbone. A burst of three input frames is processed; after initial convolutional feature extraction, PFM fuses across frames, followed by hierarchical encoding and feature refinement (AFFN), with WDAM deployed during decoding. The final flicker-free output is generated via residual learning.
Quantitative and Qualitative Evaluation
Extensive evaluation on the BurstDeflicker benchmark demonstrates Flickerformer's superiority. Numerical results show consistent outperformance across PSNR, SSIM, and LPIPS metrics:
- PSNR: Flickerformer achieves 31.226 dB, a +0.580 dB gain over the second-best method (AST [76]), while maintaining a low parameter count (3.92M) and FLOP budget.
- SSIM and LPIPS: Flickerformer attains the best scores on both metrics, indicating gains in structural fidelity and perceptual quality.
Visual comparisons highlight Flickerformer's capacity to thoroughly suppress flicker without color deviations or motion ghosting, especially in challenging regions (e.g., screens, extreme light extinction).
Ablation Studies and Module Effectiveness
Ablations confirm substantial gains from each design element:
- AFFN improves PSNR by +0.265 dB over FRFN alternatives at equivalent complexity.
- WDAM confers a +0.229 dB PSNR increment over the best sparse attention module, with reduced computational cost.
- PFM, AFFN, and WDAM individually yield notable performance increments compared to baseline architectures, validating the periodicity and directionality priors.
Limitations and Practical Implications
Flickerformer's restoration relies on the presence of clean regions across the burst. Complete recovery in scenarios where all burst frames are severely degraded remains problematic. This limitation suggests that future architectures should explore hallucination or global priors to address flicker-induced information gaps.
Practically, Flickerformer sets a new reference for flicker removal in burst imaging, with deployment potential across HDR, slow-motion, and surveillance pipelines.
Theoretical Implications and Future Directions
The approach demonstrates the efficacy of embedding explicit physical priors into deep restoration models, illustrating that structured degradations demand principled architecture design beyond generic restoration. The periodicity-directionality duality may extend to other artifact domains (e.g., moiré, banding) where physical source characteristics are partially known.
Future directions include:
- Generalizing phase and wavelet-based priors for broader classes of structured artifacts.
- Enhancing burst restoration models with cross-frame attention and global context aggregation to compensate for extreme flicker.
- Investigating joint flicker removal and other restoration tasks (e.g., denoising, deblurring) under unified frameworks for compound degradations.
Conclusion
Flickerformer achieves state-of-the-art flicker suppression by coupling frequency-domain periodicity modeling and spatial-directional attention within a transformer framework. Its principled integration of signal-processing techniques with modern attention mechanisms underscores the value of physics-aware deep restoration models in challenging burst imaging scenarios. Limitations under severe degradation warrant future exploration of global priors and hybrid restoration paradigms.