
LLIE Models: Advances in Low-Light Enhancement

Updated 1 January 2026
  • Low-light image enhancement models are computational methods that restore images captured under poor illumination by addressing degradations like noise, color skew, and contrast loss.
  • They leverage diverse strategies including CNNs, transformer architectures, and generative diffusion techniques to model degradation processes and natural image priors.
  • Recent advancements integrate frequency-domain disentanglement, ISP-driven simulation pipelines, and recognition-aware enhancements to achieve robust PSNR and SSIM improvements.

Low-light image enhancement (LLIE) models constitute a critical research area within computational photography and computer vision, targeting the restoration of visually and semantically faithful images from scenes captured under insufficient illumination. The diversity of physical degradations—contrast loss, quantization noise, color skew, and signal-dependent distortions—necessitates a broad spectrum of approaches spanning data-driven convolutional neural networks (CNNs), transformer-based architectures, generative diffusion models, frequency-domain and latent disentanglement paradigms, as well as optimization-inspired and unsupervised schemes. LLIE research has evolved towards more principled modeling of both degradation processes and natural image priors, with recent advances incorporating explicit degradation-awareness, frequency consistency, adaptable feature quantization, ISP-driven simulation pipelines, and sophisticated fusion or guidance mechanisms. This article surveys the core principles, representative architectures, and performance frontiers of modern LLIE models, with particular focus on research trends from 2023–2026.

1. Fundamental Challenges and Theoretical Foundations

Low-light images acquired in practical scenarios are degraded by a mixture of exposure attenuation, noise amplification, color shift, and non-uniform illumination, all of which are compounded by the signal chain of the digital camera pipeline. Formally, the canonical Retinex model posits a decomposition $y = R \odot I$, where $y$ is the observed image, $R$ the (assumed illumination-invariant) reflectance, and $I$ the non-negative illumination map. Deep learning–based LLIE models have sought to either learn a direct inverse mapping $x = \theta^{-1}(y)$, optimize for $I$ under explicit priors, or supervise an end-to-end mapping using paired datasets.
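As a concrete illustration of this decomposition, the sketch below estimates a smooth illumination map from the channel-wise maximum and recovers reflectance by division. This is a classical Retinex-style baseline rather than the method of any paper cited here, and the smoothing scale and gamma are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def retinex_decompose(y, sigma=15.0, eps=1e-4):
    """Illustrative Retinex split y = R * I.

    y     : HxWx3 low-light image in [0, 1]
    sigma : spatial scale of the smooth illumination estimate
    """
    # Crude illumination prior: smoothed maximum over color channels.
    illum = gaussian_filter(y.max(axis=2), sigma=sigma)
    illum = np.clip(illum, eps, 1.0)[..., None]   # keep I strictly positive
    refl = np.clip(y / illum, 0.0, 1.0)           # reflectance estimate
    return refl, illum

def enhance(y, gamma=0.45):
    """Classic Retinex-style enhancement: brighten only the illumination."""
    refl, illum = retinex_decompose(y)
    return np.clip(refl * (illum ** gamma), 0.0, 1.0)
```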

Major recent advances challenge the sufficiency of purely pixel-wise or Retinex-based approaches:

  • Explicitly learning degradation representations $x_D$ as in LLDiffusion (Wang et al., 2023) allows the model to parameterize complex, non-analytic factors such as noise patterns and color bias introduced during the actual image formation process.
  • Frequency-space and latent-space disentanglement strategies (e.g., Laplace-pyramid decompositions (Zhou et al., 2024), Fourier amplitude/phase separation (Tao et al., 25 Oct 2025), vector quantization (VQ) (Wu et al., 16 Oct 2025)) address the intertwined nature of low-frequency (illumination) and high-frequency (detail/noise) degradations.
  • Data-driven ISP modeling and synthesis pipelines simulate real-world RAW-to-sRGB transformations, encompassing stochastic sensor noise, varying white balance, and nonlinear tone/gamma corrections (Wang et al., 16 Apr 2025), thereby facilitating more robust LLIE method training.

2. Model Architectures: Degradation Awareness, Frequency, and Disentanglement

Degradation-Aware Diffusion Models

LLDiffusion (Wang et al., 2023) pioneers the formalization of a degradation-aware LLIE scheme using conditional diffusion. A latent encoder $E$ extracts a high-dimensional degradation representation $x_D = E(y)$, facilitating both the simulation of low-light images from clean references (DGNET) and enhancement via a dynamic degradation-aware diffusion module:

  • Conditioned denoising: $\epsilon_\theta(x_t, y, E(y), C(y), t)$, where $C(y)$ is a Retinex-style color map and $t$ the diffusion timestep.
  • Training entails a joint loss $L_{total} = L_{diff} + \alpha \|y - y'\|_1$, forcing $E(y)$ to serve both the generative and enhancement pipelines.
  • Diffusion steps follow the standard forward (noising) process:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\right)$$

with dynamic per-step affine modulation guided by $E(y)$.
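The following sketch makes the forward noising step and the conditioning interface concrete: it implements the standard DDPM-style forward process above and a noise-prediction loss whose denoiser receives $(x_t, y, E(y), C(y), t)$. The schedule values and function names are illustrative assumptions, not LLDiffusion's actual implementation.

```python
import torch
import torch.nn.functional as F

def make_schedule(T=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule and cumulative products of (1 - beta_t)."""
    betas = torch.linspace(beta_start, beta_end, T)
    alphas_cum = torch.cumprod(1.0 - betas, dim=0)
    return betas, alphas_cum

def q_sample(x0, t, alphas_cum):
    """Sample x_t ~ q(x_t | x_0), the closed form implied by the per-step Gaussian above."""
    a = alphas_cum.to(x0.device)[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    return a.sqrt() * x0 + (1.0 - a).sqrt() * noise, noise

def diffusion_loss(eps_theta, x0, y, E, C, alphas_cum):
    """Noise-prediction objective with the conditioning interface
    epsilon_theta(x_t, y, E(y), C(y), t) described in the text."""
    t = torch.randint(0, alphas_cum.shape[0], (x0.shape[0],), device=x0.device)
    x_t, noise = q_sample(x0, t, alphas_cum)
    pred = eps_theta(x_t, y, E(y), C(y), t)
    return F.mse_loss(pred, noise)
```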

Frequency and Latent Disentanglement

Frequency-disentangling paradigms (e.g., advanced Laplace decomposition (Zhou et al., 2024), FSIDNet's two-stage amplitude/phase model (Tao et al., 25 Oct 2025), latent-VQ-based representations (Wu et al., 16 Oct 2025)) consistently enhance performance across backbones:

  • Laplace-pyramid or frequency-separation ensures low-frequency consistency for illumination while decoupling restoration of high-frequency detail/noise.
  • FSIDNet (Tao et al., 25 Oct 2025) orchestrates an amplitude-guided enhancement stage, followed by a phase-guided structure refinement, with frequency–spatial interaction blocks exchanging information across both domains (a minimal amplitude/phase sketch follows this list).
  • Latent disentanglement approaches (LDE-Net (Zheng et al., 2024)) replace explicit pixel decomposition with joint content/illumination representation learning in feature space, leading to improved robustness and transferability.
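A minimal sketch of the Fourier amplitude/phase split underlying such frequency-disentangled designs is given below. The amplitude-swap function is a toy demonstration of the common observation that the amplitude spectrum carries most of the global illumination while the phase preserves structure; it is not FSIDNet's actual pipeline.

```python
import torch

def fft_split(x):
    """Split an image batch (B, C, H, W) into Fourier amplitude and phase."""
    spec = torch.fft.fft2(x, norm="ortho")
    return spec.abs(), spec.angle()

def fft_merge(amplitude, phase):
    """Recombine amplitude and phase into a spatial-domain image."""
    spec = torch.polar(amplitude, phase)  # amplitude * exp(i * phase)
    return torch.fft.ifft2(spec, norm="ortho").real

def amplitude_swap(low, normal):
    """Toy demonstration: pair the normal-light amplitude with the low-light phase.
    The result inherits the brighter global illumination while keeping the
    low-light image's spatial structure."""
    amp_low, pha_low = fft_split(low)
    amp_nrm, _ = fft_split(normal)
    return fft_merge(amp_nrm, pha_low)
```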

3. Diffusion, Transformer, and Hybrid Generative Approaches

Diffusion generative models have established new LLIE frontiers:

  • Conditional denoising diffusion mechanisms (LLDiffusion (Wang et al., 2023), TriFuse (Islam et al., 2024), GPP-LLIE (Zhou et al., 2024), survey (Adhikarla et al., 7 Oct 2025)) support explicit or learned conditioning on degradation maps, frequency bands, and semantic or perceptual priors.
  • Multi-perspective taxonomies (Adhikarla et al., 7 Oct 2025) classify diffusion-based LLIE models by their conditioning regime (intrinsic, latent, accelerated, guided, multimodal, autonomous), with corresponding advantages in interpretability, controllability, or data efficiency.
  • Transformer-diffusion hybrids such as TriFuse exploit global context and multi-scale feature fusion, integrating edge-sharpening via wavelet decomposition and cross-attention mechanisms (Islam et al., 2024).
  • Generative perceptual prior models (GPP-LLIE (Zhou et al., 2024)) utilize VLM-driven global and local quality assessments to adapt backbone normalization and attention, directly improving both fidelity and perceptual scores.
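As a hypothetical sketch of how such a global perceptual prior (e.g. a VLM quality embedding) can adapt backbone normalization, the module below predicts per-channel scale and shift from the prior vector. The layer name and dimensions are assumptions, not GPP-LLIE's published implementation.

```python
import torch
import torch.nn as nn

class PriorModulatedNorm(nn.Module):
    """Hypothetical normalization layer whose affine parameters are predicted
    from a global perceptual-prior vector (e.g. a quality embedding)."""
    def __init__(self, channels, prior_dim):
        super().__init__()
        # One group == normalize jointly over all channels and spatial positions.
        self.norm = nn.GroupNorm(1, channels, affine=False)
        self.to_scale_shift = nn.Linear(prior_dim, 2 * channels)

    def forward(self, feat, prior):
        # feat: (B, C, H, W) backbone features, prior: (B, prior_dim) global prior.
        scale, shift = self.to_scale_shift(prior).chunk(2, dim=1)
        scale = scale[..., None, None]
        shift = shift[..., None, None]
        return self.norm(feat) * (1.0 + scale) + shift
```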

Recognition-aware LLIE models have also emerged, in which enhancement is optimized to improve downstream tasks such as pose estimation or segmentation, for example via a global/pixel-wise optimization framework (Ono et al., 8 Jan 2025).

4. Training Strategies, Data Synthesis, and Optimization

The quality and diversity of training data remain a principal limiting factor for real-world LLIE robustness:

  • ISP-driven data modeling pipelines (Wang et al., 16 Apr 2025) synthesize virtually unlimited paired low/high-light data by simulating full RAW-to-sRGB signal chains with randomized exposure, noise, white balance, color correction, and gamma/tone mapping (a minimal synthesis sketch follows this list).
  • Diffusion and supervised models trained with such diversified data achieve high PSNR/SSIM on both synthetic and real benchmarks, as well as strong no-reference quality scores and improved performance in high-level tasks (e.g. object detection, segmentation).
  • Compact architectures focused on efficiency (LiteIE (Bai et al., 6 Jul 2025), SCLM (Zhang et al., 2023)) demonstrate that, under sufficient data diversity and principled loss design, high-quality LLIE is feasible with sub-kilobyte parameter footprints, enabling real-time enhancement on mobile or embedded devices.
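A minimal sketch of such ISP-driven pair synthesis is shown below: it attenuates exposure, injects signal-dependent shot noise and signal-independent read noise in the linear domain, and then applies shared white balance and gamma to produce a low/normal-light pair. All parameter ranges are illustrative assumptions, not the calibrated pipeline of (Wang et al., 16 Apr 2025).

```python
import numpy as np

def synthesize_low_light_pair(raw_linear, rng):
    """Simulate a low/normal-light sRGB pair from one linear, RAW-like image.

    raw_linear : HxWx3 linear-intensity image in [0, 1]
    rng        : np.random.Generator
    Parameter ranges are illustrative, not calibrated sensor values.
    """
    # Shared ISP stages for both branches: white balance + gamma.
    wb = rng.uniform(0.8, 1.2, size=3)
    normal = np.clip((raw_linear * wb) ** (1.0 / 2.2), 0.0, 1.0)

    # Low-light branch: exposure attenuation, then signal-dependent shot noise
    # and signal-independent read noise, injected in the linear domain.
    exposure = rng.uniform(0.02, 0.2)
    dark = raw_linear * exposure
    shot = rng.poisson(dark * 1000.0) / 1000.0        # photon (shot) noise
    read = rng.normal(0.0, 0.002, size=dark.shape)    # read noise
    noisy = np.clip(shot + read, 0.0, 1.0)
    low = np.clip((noisy * wb) ** (1.0 / 2.2), 0.0, 1.0)
    return low, normal

# Example: low, normal = synthesize_low_light_pair(img, np.random.default_rng(0))
```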

Loss designs span pixel-wise (L1), perceptual (LPIPS, VGG), consistency (low-frequency, lighting-style), adversarial, and task-specific (recognition accuracy) objectives, with hybrid or alternated optimization required to balance conflicting targets.
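A hedged sketch of such a hybrid objective is shown below, combining a pixel-wise L1 term, a perceptual distance computed in a frozen feature space, and a low-frequency consistency term; the weights and the pooling-based low-frequency proxy are illustrative choices rather than any specific paper's recipe.

```python
import torch
import torch.nn.functional as F

def hybrid_llie_loss(pred, target, perceptual_net, w_pix=1.0, w_perc=0.1, w_lf=0.5):
    """Weighted sum of common LLIE objectives; weights are illustrative.

    perceptual_net : frozen feature extractor (e.g. VGG features) defining a
                     perceptual distance; any differentiable embedding works.
    """
    l_pix = F.l1_loss(pred, target)                                   # pixel fidelity
    l_perc = F.l1_loss(perceptual_net(pred), perceptual_net(target))  # perceptual
    # Low-frequency (illumination) consistency via heavy downsampling.
    l_lf = F.l1_loss(F.avg_pool2d(pred, 8), F.avg_pool2d(target, 8))
    return w_pix * l_pix + w_perc * l_perc + w_lf * l_lf
```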

5. Performance Analysis, Generalization, and Ablations

Comprehensive experimental benchmarks across paired datasets (LOL, LOL-v2, LSRW) and real-world unpaired datasets (DICM, LIME, NPE, LoLI-Street) establish SOTA improvements:

  • LLDiffusion (Wang et al., 2023) achieves PSNR = 24.65 dB, SSIM = 0.843 on LOL; VE-LOL: 31.77 dB (previous best ∼28 dB).
  • FSIDNet (Tao et al., 25 Oct 2025) sets new highs on LOL-Real, LOL-Syn, and LSRW-Huawei, with consistent NIQE reductions on real datasets.
  • GPP-LLIE (Zhou et al., 2024) leads FID, LPIPS, DISTS, and NIQE across both paired and unpaired benchmarks.
  • Linear-fusion frameworks (FusionNet (Shi et al., 27 Apr 2025)) show that orthogonal Hilbert space projections can systematically outperform single-architecture models by convex-combining multiple feature domains.
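The sketch below illustrates the convex-combination idea in its simplest form: a learned set of non-negative weights summing to one fuses the outputs of several frozen enhancers. Fusing images directly (rather than orthogonal Hilbert-space feature projections) is a simplification for illustration, not FusionNet's formulation.

```python
import torch
import torch.nn as nn

class ConvexFusion(nn.Module):
    """Learned convex combination (non-negative weights summing to one) of the
    outputs of several frozen enhancers."""
    def __init__(self, num_models):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_models))

    def forward(self, outputs):
        # outputs: list of (B, C, H, W) enhanced images from different backbones.
        w = torch.softmax(self.logits, dim=0)          # convex weights
        stacked = torch.stack(outputs, dim=0)          # (M, B, C, H, W)
        return (w.view(-1, 1, 1, 1, 1) * stacked).sum(dim=0)
```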

Ablation studies across all advanced models confirm that explicit modeling of degradation, frequency or latent disentanglement, informed conditioning, and fusion of complementary architectures each deliver measurable, statistically consistent improvements. The introduction of modular plug-ins (e.g. Laplace-consistency (Zhou et al., 2024), ADR/POG redundancy reduction (Li et al., 2024)) yields further additive gains when integrated into diverse backbones.

6. Limitations, Open Problems, and Research Directions

Despite progress, LLIE models exhibit persistent limitations:

  • Extant encoders and degradation extractors may underfit extreme or exotic degradations (e.g. spatially-varying, sensor-specific noise, over-amplified regions with no signal).
  • Transfer to video remains problematic, with temporal consistency unaddressed in most architectures (Wang et al., 2023).
  • Sample-specific fusion or guidance remains an open research direction, as current convex-fusion models use static weights per dataset (Shi et al., 27 Apr 2025).
  • Lightweighting and quantization (ADR/POG (Li et al., 2024), SCLM (Zhang et al., 2023), LiteIE (Bai et al., 6 Jul 2025)) must be further studied for deployment in edge and embedded systems, particularly under memory, energy, and latency constraints.
  • Theoretical understanding of why frequency/latent/disentanglement aids generalization—especially in the unsupervised or zero-shot setting—remains underexplored.
  • Real-world cross-device/camera adaptation and unsupervised training remain only partially solved; foundation models and multi-modal or VLM-guided priors represent promising, yet under-exploited, directions (Zhou et al., 2024, Adhikarla et al., 7 Oct 2025).

Emergent consensus supports:

  • Explicit modeling of (and conditioning on) degradation patterns is essential for robust enhancement under varied real-world conditions.
  • Frequency/latent disentanglement, when paired with global context and adaptive feature learning, further stabilizes color/structure restoration.
  • Diffusion-based inference and semantic or instruction-based guidance point toward new applications (e.g. task-specific LLIE), provided challenges of efficiency, controllability, and realism are addressed at the architecture and training levels.

7. Summary Table: SOTA LLIE Architectures and Features

| Model/Paradigm | Key Innovation | Conditioning / Priors | Best PSNR (LOL / LOL-v2) |
|---|---|---|---|
| LLDiffusion (Wang et al., 2023) | Degradation-aware joint diffusion | Learned degradation + color map | 24.65 / 25.99 |
| FSIDNet (Tao et al., 25 Oct 2025) | Frequency–spatial two-stage interaction | Fourier amplitude/phase + IEM | 22.65 / 24.93 |
| GPP-LLIE (Zhou et al., 2024) | VLM-based perceptual prior (global/local) | VLM global/local attributes | 27.51 / 30.17 |
| LiteIE (Bai et al., 6 Jul 2025) | Extreme parameter-minimal unsupervised | Unsupervised edge/exposure/color | 19.04 (LOL) |
| FusionNet (Shi et al., 27 Apr 2025) | Multi-backbone linear Hilbert fusion | sRGB, Retinex, HVI domains | 25.17 / 24.44 |
| LightQANet (Wu et al., 16 Oct 2025) | Explicit light quantization + adaptive prompts | Vector quantization (VQ) + adaptive prompt | 28.51 / 26.15 |
| ISP-driven U-Net (Wang et al., 16 Apr 2025) | Realistic data synthesis via full ISP | Synthetic RAW/sRGB paired data | 23.91 / 23.10 (FT) |

For a more comprehensive taxonomy and performance matrix, see (Adhikarla et al., 7 Oct 2025).


In sum, contemporary LLIE models combine explicit degradation modeling, frequency or latent disentanglement, adaptable attention and prompt architectures, and domain-appropriate data synthesis to achieve robust, generalizable, and computationally efficient enhancement. Ongoing advances will likely be driven by further integration of semantic priors, multi-modality, and efficient architecture search, underpinned by improved data-centric design.
