
ERIENet: Efficient RAW Image Enhancement

Updated 20 December 2025
  • The paper demonstrates that ERIENet achieves real-time low-light RAW image enhancement using a fully-parallel, multi-scale feature extraction strategy.
  • The model integrates a channel-aware residual dense block and green channel guidance to optimize feature reuse and adaptive normalization.
  • Quantitative tests show ERIENet matches SOTA-quality results at a fraction of the cost, with greatly reduced FLOPs and parameter count and competitive PSNR and SSIM.

The Efficient RAW Image Enhancement Network (ERIENet) is an architecture specifically designed for enhancing RAW images captured in low-light environments. ERIENet introduces a parallel multi-scale feature extraction scheme and leverages the inherent properties of RAW Bayer data—particularly the green channel dominance—to achieve superior quality and computational efficiency compared to previous methods. It achieves real-time performance on high-resolution images with significantly reduced FLOPs and parameter count relative to state-of-the-art (SOTA) approaches while delivering high-fidelity enhancement results (Wang et al., 17 Dec 2025).

1. Parallel Multi-Scale Architecture

ERIENet eschews traditional sequential multi-scale encoders in favor of a fully-parallel, multi-scale feature extraction and fusion architecture. An input RAW Bayer image $I \in \mathbb{R}^{H \times W \times 1}$ is reshaped ("packed") into an RGGB 4-channel representation $I_\mathrm{input} \in \mathbb{R}^{(H/2) \times (W/2) \times 4}$. Three parallel branches process $I_\mathrm{input}$ at downsampled resolutions corresponding to scale factors $s \in \{4, 8, 16\}$. Each branch uses a depth-wise-separable convolution ($\mathrm{DSConv}_s$) to extract $F_s = \mathrm{DSConv}_s(I_\mathrm{input})$, reducing spatial footprint and computational cost.
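The packing step described above can be sketched in a few lines of NumPy. The exact mosaic layout (R at the top-left) is an illustrative assumption, not taken from the paper:

```python
import numpy as np

def pack_rggb(raw):
    """Pack a single-channel RAW Bayer image (H, W) into an
    (H/2, W/2, 4) RGGB tensor. Assumes an RGGB mosaic with R at
    position (0, 0); this layout is illustrative."""
    return np.stack([
        raw[0::2, 0::2],  # R
        raw[0::2, 1::2],  # G (red rows)
        raw[1::2, 0::2],  # G (blue rows)
        raw[1::2, 1::2],  # B
    ], axis=-1)

raw = np.arange(16, dtype=np.float32).reshape(4, 4)
packed = pack_rggb(raw)
print(packed.shape)  # (2, 2, 4)
```

Each output pixel thus gathers the four samples of one 2x2 Bayer cell, halving the spatial resolution while preserving all measurements.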

Feature extraction in each branch is conducted independently using stacked Channel-Aware Residual Dense Blocks (CRDB, see Section 2). At the lowest resolution (scale $s=16$), features $F_{16}$ are modulated by a learnable channel-wise global mask.

Progressive fusion is performed as follows:

  • $E'_8 = \mathrm{Conv}_{3 \times 3}(\mathrm{Concat}(E_8, \mathrm{Up}_2(E_{16})))$
  • $E'_4 = \mathrm{Conv}_{3 \times 3}(\mathrm{Concat}(E_4, \mathrm{Up}_2(E'_8)))$
  • $I_\mathrm{out} = \mathrm{ResBlock}(\mathrm{Up}_2(E'_4)) + \mathrm{skip}$

This approach significantly reduces the depth and latency of any single branch, concentrates heavy computation at the coarsest resolution, and allows effective multi-scale context aggregation.
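The progressive fusion can be sketched with a nearest-neighbour 2x upsample and a 1x1 channel-mixing step standing in for the paper's 3x3 convolution (a simplification; the shapes and random weights below are purely illustrative):

```python
import numpy as np

def up2(x):
    # Nearest-neighbour 2x spatial upsampling: (H, W, C) -> (2H, 2W, C).
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fuse(fine, coarse, w):
    # Concatenate fine-scale features with the upsampled coarse-scale
    # features, then mix channels; the paper's 3x3 convolution is
    # simplified here to a 1x1 channel mix (a matrix multiply).
    cat = np.concatenate([fine, up2(coarse)], axis=-1)
    return cat @ w  # w: (C_fine + C_coarse, C_out)

rng = np.random.default_rng(0)
E16 = rng.standard_normal((4, 4, 8))  # coarsest-branch output (toy sizes)
E8 = rng.standard_normal((8, 8, 8))
w = rng.standard_normal((16, 8))      # stand-in for learned conv weights
E8p = fuse(E8, E16, w)                # E'_8 in the fusion equations above
print(E8p.shape)  # (8, 8, 8)
```

Repeating `fuse` at scale 4 and then upsampling once more reproduces the coarse-to-fine flow of the three equations above.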

2. Channel-Aware Residual Dense Block (CRDB)

The CRDB module extends the classic Residual Dense Block (RDB) by appending an Efficient Channel Attention (ECA) mechanism. For CRDB-$n$ ($n$ convolutions):

  • Input $X_0$ passes through $n$ sequential $3 \times 3$ Conv–ReLU units to yield $X_1, \dots, X_n$.
  • Intermediate representations are concatenated: $X_\mathrm{cat} = \mathrm{concat}(X_1, \dots, X_n)$.
  • Feature fusion uses a $1 \times 1$ convolution: $X_f = \mathrm{Conv}_{1 \times 1}(X_\mathrm{cat})$.
  • Channel recalibration: $w = \sigma(\mathrm{Conv1D}_k(\mathrm{GAP}(X_f)))$ with $k=3$, where $\mathrm{GAP}$ denotes global average pooling.
  • Output: $X_\mathrm{out} = X_f \odot w$.
  • Final output: $\mathrm{CRDB}(X_0) = X_0 + X_\mathrm{out}$ via residual addition.

This design provides improved feature reuse, lightweight channel attention, and reduced FLOPs relative to vanilla RDBs.
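A minimal sketch of the ECA recalibration step inside the CRDB, assuming a fixed averaging kernel in place of the learned 1D convolution:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eca(x, k=3):
    # Efficient Channel Attention as appended to the RDB:
    # global average pool -> 1D conv (kernel size k) across channels
    # -> sigmoid gate -> channel-wise rescaling.
    gap = x.mean(axis=(0, 1))      # (C,) global average pooling
    kernel = np.ones(k) / k        # averaging stand-in for the learned kernel
    w = sigmoid(np.convolve(gap, kernel, mode="same"))
    return x * w                   # X_out = X_f * w, channel-wise

x = np.random.default_rng(1).standard_normal((8, 8, 16))
y = eca(x)                         # the CRDB then adds the residual: X_0 + y
print(y.shape)  # (8, 8, 16)
```

Because the gate acts on channel statistics alone, its cost is negligible next to the dense convolutions it modulates.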

3. Green Channel Guidance Branch

ERIENet explicitly capitalizes on the green channel's predominance in Bayer patterns. The Green Channel Guidance (GCG) branch operates only at the coarsest scale ($s=16$):

  • Extract the two green channels, $G \in \mathbb{R}^{(H/32) \times (W/32) \times 2}$, from $I_\mathrm{input}$.
  • Compute scale ($\gamma$) and shift ($\beta$) parameters using two $3 \times 3$ convolutions: $\gamma = \mathrm{Conv}_{3 \times 3}(G)$ and $\beta = \mathrm{Conv}_{3 \times 3}(G)$.
  • For each input $F$, compute SAN (Spatially-Adaptive Normalization):

$$\mu = \mathrm{Mean}(F), \quad \sigma = \mathrm{Std}(F), \quad \mathrm{SAN}(F) = \gamma \odot \frac{F - \mu}{\sigma} + \beta$$

  • SAN is integrated by replacing conventional BatchNorm in CRDB at $s=16$, such that the green channel-derived illumination guides adaptive normalization of feature maps.
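The SAN computation above can be sketched as follows; here $\mu$ and $\sigma$ are taken globally over $F$ (one plausible reading of Mean/Std), and the $\gamma$, $\beta$ maps are placeholders for what the two convolutions over $G$ would predict:

```python
import numpy as np

def san(f, gamma, beta, eps=1e-5):
    # Spatially-Adaptive Normalization driven by green-channel features:
    # normalize F to zero mean / unit std, then apply the spatial
    # scale (gamma) and shift (beta) maps predicted from G.
    mu = f.mean()
    sigma = f.std()
    return gamma * ((f - mu) / (sigma + eps)) + beta

rng = np.random.default_rng(0)
f = rng.standard_normal((4, 4, 8))  # feature map at the coarsest scale
gamma = np.ones((4, 4, 1))          # placeholder scale map from G
beta = np.zeros((4, 4, 1))          # placeholder shift map from G
out = san(f, gamma, beta)
print(out.shape)  # (4, 4, 8)
```

With identity $\gamma$ and zero $\beta$ this reduces to plain normalization; the learned maps let brighter green-channel regions rescale features spatially.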

Ablation studies confirm that including GCG yields pronounced improvements in PSNR, SSIM, and LPIPS over variants that omit it, and that the BatchNorm-based SAN outperforms a LayerNorm alternative.

4. Loss Functions and Training Paradigm

ERIENet is trained end-to-end with a composite objective function:

  • L1 loss: $L_1 = \|I_\mathrm{out} - I_\mathrm{gt}\|_1$
  • Wavelet SSIM loss:

$$L_\mathrm{wssim} = \sum_{i, w} r_i \left(1 - \mathrm{SSIM}\!\left(\mathrm{DWT}^i_w(I_\mathrm{out}), \mathrm{DWT}^i_w(I_\mathrm{gt})\right)\right)$$

using a three-level 2D Haar DWT across subbands $w \in \{\mathrm{LL}, \mathrm{LH}, \mathrm{HL}, \mathrm{HH}\}$.

  • Wavelet MSE loss:

$$L_\mathrm{wmse} = \mathrm{MSE}(I_\mathrm{out}, I_\mathrm{gt}) + \sum_{t=1}^{3} \mathrm{MSE}\!\left(\mathrm{DWT}^t(I_\mathrm{out}), \mathrm{DWT}^t(I_\mathrm{gt})\right)$$

The total loss is $L = L_1 + \alpha_\mathrm{wssim} L_\mathrm{wssim} + \alpha_\mathrm{wmse} L_\mathrm{wmse}$ with $\alpha_\mathrm{wssim} = \alpha_\mathrm{wmse} = 0.5$. Training runs for 500 epochs with the Adam optimizer ($\beta_1 = 0.5$, $\beta_2 = 0.999$) on $512 \times 512$ patches with augmentation, using the SID (Sony Bayer) and ELD datasets and sRGB ground-truth images obtained with Rawpy.
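A minimal sketch of the wavelet MSE term, assuming $\mathrm{DWT}^t$ denotes the LL band after $t$ levels of the Haar transform (one plausible reading of the formula above):

```python
import numpy as np

def haar_dwt2(x):
    # One level of the 2D Haar DWT: return the LL, LH, HL, HH subbands.
    a = x[0::2, 0::2]; b = x[0::2, 1::2]
    c = x[1::2, 0::2]; d = x[1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

def wavelet_mse(out, gt, levels=3):
    # L_wmse sketch: pixel-domain MSE plus MSE on the LL band
    # at each of `levels` successive DWT scales.
    loss = np.mean((out - gt) ** 2)
    o, g = out, gt
    for _ in range(levels):
        o = haar_dwt2(o)[0]
        g = haar_dwt2(g)[0]
        loss += np.mean((o - g) ** 2)
    return loss

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
print(wavelet_mse(x, x))  # 0.0
```

The multi-scale terms penalize errors in low-frequency illumination that a pixel-domain MSE alone weights weakly.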

5. Computational Efficiency and Comparative Performance

ERIENet demonstrates significant improvements in computational metrics while providing SOTA-level enhancement quality. On an RTX 3090 (24 GB), it runs inference on 4K-resolution RAW images at 146 FPS (6.84 ms per image), with 39.29 GFLOPs and 1.419M parameters.

Performance comparison (SID [9] test subset):

| Method | PSNR (dB) | SSIM | FLOPs | Params | FPS (4K) |
|---|---|---|---|---|---|
| SID [9] | 28.62 | 0.798 | 523.83G | 7.761M | N/A |
| RAWFormer | 29.22 | 0.790 | 781.54G | 3.401M | N/A |
| SMG | 30.17 | 0.834 | 1274.27G | 18.355M | N/A |
| ERIENet | 29.12 | 0.797 | 39.29G | 1.419M | 146.2 |

Ablation studies attribute ERIENet's performance to (a) its fully parallel multi-scale strategy, (b) the use of GCG, and (c) the CRDB module. The full three-branch design yields the largest performance gain relative to single- or two-branch variants, while GCG and CRDB offer substantial incremental improvements in all perceptual metrics.

6. Quantitative and Qualitative Evaluation

Quantitative metrics on SID (Sony Bayer subset):

| Method | Time (ms) | GFLOPs | Params (M) | PSNR (dB) | SSIM | LPIPS |
|---|---|---|---|---|---|---|
| RAWFormer | 20.69 | 781.54 | 3.401 | 29.22 | 0.790 | 0.258 |
| SMG | 7.72 | 1274.27 | 18.355 | 30.17 | 0.834 | 0.238 |
| DNF | 2874.44 | 11.140 | 0.797 | 30.62 | 0.797 | 0.343 |
| ERIENet | 6.84 | 39.29 | 1.419 | 29.12 | 0.797 | 0.259 |

Qualitative assessment indicates reduced artifacts in highlights, improved dark-region detail fidelity, and perceptually better outputs compared to competing methods.

7. Implications and Future Extensions

ERIENet validates the hypothesis that fully-parallel, multi-scale feature extraction—augmented with green channel guidance and efficient dense block modules—can yield both real-time inference capability and enhanced reconstruction accuracy for low-light RAW images at 4K resolution. Potential extensions include adapting the architecture for video (exploiting temporal structure for further denoising or enhancement), integrating enhancement with downstream recognition tasks, and exploring dynamic scale weighting or neural architecture search to maximize efficiency on edge hardware (Wang et al., 17 Dec 2025).

