Extreme Low-Light Denoising Dataset (ELD)
- ELD is a benchmark for raw-domain image denoising under extreme low-light, featuring multi-camera acquisitions and rigorous physical noise calibration.
- It employs detailed noise modeling—including photon shot, long-tailed read, row banding, and quantization noise—to generate authentic noisy/clean raw pairs.
- The dataset enables synthetic noise generation and cross-sensor adaptation; denoisers trained on the synthesized data reach PSNR/SSIM competitive with training on real pairs.
The Extreme Low-Light Denoising (ELD) Dataset is a curated benchmark for developing and evaluating raw-domain image denoising techniques at extremely low photon counts, where sensor noise dominates and conventional learning-based enhancement models often fail. It is distinguished by its multi-camera acquisition regime, per-camera and per-ISO physical noise calibration, coverage of challenging sensor-specific effects (long-tailed read noise, row banding), and capture protocols that minimize confounders such as stacking artifacts, misregistration, and uncontrolled lighting. These properties have made it a standard resource for sensor-physics-aware denoising research (Wei et al., 2020, Wei et al., 2021).
1. Dataset Composition and Acquisition Protocol
The ELD dataset consists of 240 raw image pairs acquired in controlled laboratory conditions (Wei et al., 2020, Wei et al., 2021). The dataset encompasses:
- Cameras: Four CMOS cameras with distinct sensor architectures are used (Sony α7S II, Nikon D850, Canon EOS 70D, Canon EOS 700D), covering DSLR and mirrorless bodies and spanning full-frame and APS-C formats.
- Scene Variations: Ten different indoor static scenes per camera.
- Exposure Settings: For each scene, raw reference (“ground truth”) images are captured under the camera's base ISO with a long exposure, providing high-SNR clean signals. Noisy counterparts are obtained at ISOs 800, 1600, and 3200, with the exposure reduced by low-light factors of $\times 100$ and $\times 200$, resulting in pseudo-ISO values up to 640,000 (ISO 3200 at $\times 200$) in the most challenging cases.
- Pair Structure: For each camera, scene, ISO, and low-light factor combination, a noisy/clean raw pair is recorded, leading to a total of $4$ cameras × $10$ scenes × $3$ ISOs × $2$ factors = $240$ pairs.
- Data Format: Images are saved as linear Bayer RAW (12–14 bit) with per-channel black-level correction. For network training, each patch is transformed into a 4-channel tensor (R, G₁, G₂, B), and data augmentation is based on random 512 × 512 crops and geometric transformations.
- Illumination Control: All scenes are shot under dim DC illumination (20 lux) to suppress temporal flicker, ensuring noise statistics are governed by the sensor, not environmental instability (Wei et al., 2021).
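The packing and cropping described under "Data Format" can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes an RGGB Bayer layout, a single scalar black level, normalization by a white level, and cropping in the packed domain, none of which is specified above. The function names `pack_bayer` and `random_crop_pair` are hypothetical.

```python
import numpy as np

def pack_bayer(raw, black_level, white_level):
    """Pack an (H, W) Bayer mosaic into a (4, H/2, W/2) array (R, G1, G2, B).

    Assumes an RGGB layout and scalar black/white levels (illustrative only).
    """
    # Convert to float first so the subtraction cannot underflow unsigned data.
    norm = (raw.astype(np.float32) - black_level) / (white_level - black_level)
    return np.stack([
        norm[0::2, 0::2],  # R
        norm[0::2, 1::2],  # G1
        norm[1::2, 0::2],  # G2
        norm[1::2, 1::2],  # B
    ])

def random_crop_pair(noisy, clean, size=512, rng=None):
    """Cut the same random size x size window out of an aligned packed pair."""
    rng = np.random.default_rng() if rng is None else rng
    _, h, w = noisy.shape
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    window = (slice(None), slice(top, top + size), slice(left, left + size))
    return noisy[window], clean[window]
```

Using the same window for both images keeps the noisy/clean pair pixel-aligned, which is the property the tripod-mounted capture protocol is designed to guarantee.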
2. Physical Noise Model and Calibration
The ELD dataset departs from legacy heteroscedastic Gaussian approximations, instead adopting a physically accurate model of sensor data formation:
$$D = K(I + N_p) + N_{\mathrm{read}} + N_r + N_q$$
where $D$ is the recorded digital raw value and:
- $I$: ideal electron count (signal);
- $K$: analog+digital gain (ISO-dependent);
- $N_p$: photon shot noise;
- $N_{\mathrm{read}}$: zero-mean long-tailed read noise;
- $N_r$: row (banding) noise;
- $N_q$: ADC quantization noise.
Calibration Procedure
- Gain ($K$): Estimated per-camera/ISO through the slope of a photon transfer curve (variance vs. mean on flat-field frames).
- Banding/Read Noise: Row banding ($N_r$) is modeled by a per-row normal distribution and fit via Fourier/bias frame analysis. The heavy tail of the read noise is quantified by fitting a Tukey lambda distribution to the residuals after row subtraction, using the Probability Plot Correlation Coefficient (PPCC) procedure for the shape parameter $\lambda$ and probability plots for the scale parameter $\sigma_{TL}$.
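- Parameter Sampling: For synthesizing realistic noise, per-camera/ISO models are used to sample plausible values of the gain $K$, the Tukey lambda shape $\lambda$ and scale $\sigma_{TL}$, and the row-noise standard deviation $\sigma_r$.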
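The gain step can be illustrated with a short sketch of a photon transfer curve fit. It relies on the fact that, for a Poisson signal scaled by a gain $K$ plus signal-independent noise, the variance of the digital signal is linear in its mean with slope $K$. The helper name `estimate_gain`, the use of frame pairs to cancel fixed-pattern structure, and all numeric values in the simulated check are assumptions for illustration, not ELD calibration results.

```python
import numpy as np

def estimate_gain(flat_pairs):
    """Estimate the system gain K from pairs of flat-field frames.

    flat_pairs: list of (frame_a, frame_b), black-level-subtracted, one pair
    per illumination level. Differencing the pair cancels fixed-pattern
    structure, so var(a - b) / 2 approximates the temporal noise variance.
    """
    means, variances = [], []
    for a, b in flat_pairs:
        a = a.astype(np.float64)
        b = b.astype(np.float64)
        means.append(0.5 * (a.mean() + b.mean()))
        variances.append(np.var(a - b) / 2.0)
    # Var(D) = K * E[D] + const, so the slope of the fitted line is K.
    slope, intercept = np.polyfit(means, variances, 1)
    return slope, intercept

# Simulated check with hypothetical values (not measured on any ELD camera).
rng = np.random.default_rng(0)
K_true, sigma_read = 2.5, 3.0

def fake_flat(electrons):
    """One synthetic flat-field frame at a given mean electron count."""
    shot = rng.poisson(electrons, size=(256, 256))
    return K_true * shot + rng.normal(0.0, sigma_read, size=(256, 256))

pairs = [(fake_flat(e), fake_flat(e)) for e in (50, 100, 200, 400, 800)]
k_est, offset = estimate_gain(pairs)
print(f"estimated K = {k_est:.3f} (true {K_true})")
```

The intercept of the same fit absorbs the signal-independent noise variance, which is why the read and row components are calibrated separately from bias frames.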
The comprehensive inclusion of per-row banding and non-Gaussian read noise distinguishes ELD from previous datasets, which typically approximate total noise as a signal-dependent (heteroscedastic) Gaussian, failing to capture long-tail statistics and real sensor artifacts (Wei et al., 2020).
3. Dataset Structure, Preprocessing, and Access
Dataset Statistics
| Camera | Sensor Format | Scenes | ISOs | Low-light Factors | Pairs |
|---|---|---|---|---|---|
| Sony A7S II | Full-frame | 10 | 800/1600/3200 | 100, 200 | 60 |
| Nikon D850 | Full-frame | 10 | 800/1600/3200 | 100, 200 | 60 |
| Canon EOS 70D | APS-C | 10 | 800/1600/3200 | 100, 200 | 60 |
| Canon EOS 700D | APS-C | 10 | 800/1600/3200 | 100, 200 | 60 |
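As a quick arithmetic check of the table, the capture grid can be enumerated directly. The snippet below is only a sketch: the record layout is hypothetical, and the pseudo-ISO is taken to be the product of ISO and low-light factor, which matches the 640,000 maximum quoted earlier.

```python
from itertools import product

cameras = ["Sony A7S II", "Nikon D850", "Canon EOS 70D", "Canon EOS 700D"]
scenes = range(1, 11)       # ten indoor scenes per camera
isos = [800, 1600, 3200]
factors = [100, 200]        # low-light factors

# One record per noisy/clean pair (hypothetical layout, for counting only).
pairs = [
    {"camera": c, "scene": s, "iso": i, "factor": f, "pseudo_iso": i * f}
    for c, s, i, f in product(cameras, scenes, isos, factors)
]

per_camera = sum(1 for p in pairs if p["camera"] == "Nikon D850")
max_pseudo_iso = max(p["pseudo_iso"] for p in pairs)
print(len(pairs), per_camera, max_pseudo_iso)  # 240 60 640000
```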
- No dynamic content: All scenes are static and meticulously controlled for lighting. Camera is tripod-mounted; pixel alignment is exact. No temporal stacking or burst averaging—each ground truth is a single long exposure.
- Data Preparation: After black-level subtraction and packing, data are cropped to $512 \times 512$ patches for patch-based training. No demosaicing is performed prior to denoising.
- Train/Val/Test: The original ELD papers utilize the full set for evaluation and ablation. No canonical split is prescribed (Wei et al., 2020).
Dataset Availability and Licensing
- The dataset itself is not directly downloadable. Calibration code (for noise synthesis and device adaptation) is available at https://github.com/Vandermode/NoiseModel. Usage of ELD images requires contacting the corresponding author (Wei et al., 2020).
4. Synthetic Noise Generation and Cross-Sensor Generalization
A key objective of ELD is to enable creation of large-scale synthetic low-light noise datasets whose statistics faithfully match those of real extreme dark conditions on arbitrary devices.
Synthetic Pair Generation Protocol
- Starting from clean patches (either from ELD references or external RAW corpora), each patch is divided by the low-light factor to emulate photon starvation.
- Noise parameters are sampled from the jointly calibrated per-camera/ISO distributions.
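- The full physical noise process (photon, read, row, quantization) is simulated: $D = K(I + N_p) + N_{\mathrm{read}} + N_r + N_q$, with $(I + N_p) \sim \mathcal{P}(I)$ (photon), $N_{\mathrm{read}} \sim TL(\lambda; 0, \sigma_{TL})$ (read), $N_r \sim \mathcal{N}(0, \sigma_r^2)$ (row), and $N_q \sim U(-q/2, q/2)$ (quantization, with step $q$). The patch is then multiplied by the low-light factor to restore the dynamic range.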
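The three steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the released calibration code: the function names are hypothetical, the input is a single-channel, black-level-subtracted raw patch in digital units, the Tukey lambda samples are drawn by inverse-transform sampling from the distribution's quantile function, and the parameter values in the demo are arbitrary rather than calibrated.

```python
import numpy as np

def tukey_lambda_sample(lam, scale, size, rng):
    """Draw zero-centred Tukey lambda samples via the quantile function."""
    eps = 1e-12  # keep u away from 0 and 1 so negative lam stays finite
    u = rng.uniform(eps, 1.0 - eps, size)
    if lam == 0.0:
        q = np.log(u / (1.0 - u))            # logistic limit as lam -> 0
    else:
        q = (u**lam - (1.0 - u)**lam) / lam
    return scale * q

def synthesize_noisy(clean, ratio, K, lam, sigma_tl, sigma_r, q=1.0, rng=None):
    """Simulate D = K(I + N_p) + N_read + N_r + N_q on a darkened patch."""
    rng = np.random.default_rng() if rng is None else rng
    dark = np.clip(clean / ratio, 0.0, None)   # emulate photon starvation
    electrons = dark / K                       # ideal electron count I
    shot = K * rng.poisson(electrons)          # K(I + N_p), (I + N_p) ~ Poisson(I)
    read = tukey_lambda_sample(lam, sigma_tl, clean.shape, rng)
    row = rng.normal(0.0, sigma_r, size=(clean.shape[0], 1))  # one offset per row
    quant = rng.uniform(-0.5 * q, 0.5 * q, size=clean.shape)
    return (shot + read + row + quant) * ratio  # restore the dynamic range

# Demo with arbitrary, uncalibrated parameters.
rng = np.random.default_rng(0)
clean = np.full((64, 64), 1000.0)
noisy = synthesize_noisy(clean, ratio=100, K=2.0, lam=-0.1,
                         sigma_tl=1.0, sigma_r=0.5, rng=rng)
print(noisy.shape, float(noisy.mean()))
```

Broadcasting the `(H, 1)` row term across columns is what produces the horizontal banding: every pixel in a row shares the same offset, unlike the per-pixel read and quantization terms.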
Cross-Sensor Adaptation
- Retraining or fine-tuning denoising models using patches synthesized via ELD’s noise model enables robust transfer to real data from new devices. Only a fresh set of bias and flat-field frames is needed for the new camera, avoiding the labor of collecting real noisy/clean pairs on every platform.
- Comparative experiments demonstrate that U‐Net denoisers trained with this synthetic protocol attain PSNR/SSIM metrics as high as (or surpassing) those trained directly on real pairs, and are markedly more robust to sensor-specific color and texture artifacts (Wei et al., 2021).
5. Benchmarking, Evaluation Results, and Impact
The ELD dataset has emerged as a reference standard for denoising method evaluation in the extreme low-light regime.
Evaluation Protocol
- The ELD papers report quantitative results using PSNR and SSIM metrics. Denoisers are trained under (i) real pairs, (ii) simple Gaussian+Poisson synthesis, and (iii) ELD’s physics-based synthesis, then tested on the ELD pairs.
- PSNR values for models trained on ELD-style synthetic noise typically reach or slightly exceed those achieved with real-pair training. For instance, for one Sony A7S II test setting, the reported PSNR / SSIM are 44.50 dB / 0.971 (real) vs. 45.44 dB / 0.975 (synthetic). Similar trends hold for other cameras and difficulty levels.
- Qualitative analysis highlights ELD-calibrated models’ ability to remove row-banding, correct color bias, and preserve fine structure—capabilities lacking in models trained solely on other devices or with oversimplified noise assumptions.
- The physically derived nature of noise calibration in ELD enables machine learning methods to generalize “out of the box” to new sensors and configurations, reducing domain overfitting versus training solely on another platform’s image pairs (Wei et al., 2021).
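For reference, PSNR as used in comparisons like the one above can be computed in a few lines. This is a generic sketch: the image domain in which the ELD papers evaluate (raw or rendered) and the `data_range` value are not specified here and are treated as assumptions.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range**2 / mse)

# A constant error of 0.1 on a unit-range image gives MSE = 0.01, i.e. 20 dB.
example = psnr(np.zeros((8, 8)), np.full((8, 8), 0.1))
print(round(example, 2))  # 20.0
```

On this scale, the roughly 1 dB gap between 44.50 dB and 45.44 dB corresponds to about a 20% reduction in mean squared error.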
6. Comparison with Related Datasets and ELD’s Role in the Field
The ELD dataset was introduced in response to limitations in existing benchmarks:
- Compared to SID (See-in-the-Dark): SID captures ~510 pairs using two cameras and two to three darkness factors. ELD doubles the number of cameras (four versus two), offers rigorous per-camera physical calibration, and increases the “noise factor” granularity (Wei et al., 2020).
- Versus simulated/heteroscedastic synthetic schemes: ELD’s inclusion of long-tailed read noise, explicit banding, and calibration complexities better matches real-world extreme low-light data. In contrast, Gaussian+Poisson models systematically underestimate rare, structured noise events.
- Ground-truth validity: ELD’s ground truths are single-exposure, high-SNR RAWs—no temporal stacking, which can introduce ghosting or averaging artifacts in other datasets.
The ELD dataset, by modeling the full sensor noise pipeline and calibrating across multiple devices, is a pivotal resource for the systematic benchmarking, ablation, and deployment of denoising models tailored to the raw imaging domain under extreme darkness (Wei et al., 2020, Wei et al., 2021). Its synthetic pairing protocol further addresses the practical barrier of generalizing denoisers to new hardware platforms without laborious real-data acquisition, supporting advances in robust, physically-rooted learning-based image processing.