Spectrum Aware Illumination Estimation Using Multispectral Image

Published 12 Jun 2026 in eess.IV and cs.CV | (2606.14248v1)

Abstract: Multispectral (MS) imaging extends beyond conventional RGB imaging by capturing more spectral bands, thereby improving illuminant spectrum estimation (ISE). However, existing methods often fail to fully exploit spectral information, resulting in suboptimal performance under diverse lighting conditions and across different sensor domains. Hence, we propose a deep learning framework with a spatio-spectral feature extraction block, which incorporates spectral attention mechanisms to enhance spectral correlation and preserve illuminant-relevant spatial features. Through the inclusion of an illuminant prior (IP), our approach prioritizes specific channels that provide more meaningful information in an MS image. We also propose a spectral-domain transform across different MS sensor spaces. The results demonstrate that illuminant spectra learned in high-dimensional sensor spaces can be effectively transformed to various lower-dimensional camera sensor spaces without any additional training. To facilitate evaluation, we introduce a real-world MS dataset containing high-dimensional ground-truth illumination spectra captured under diverse lighting conditions. Through extensive experiments, we demonstrate that our method achieves superior accuracy compared to existing models, thus providing a practical solution for real-world ISE. The code and dataset are available at https://github.com/hyejin5/Spectrum-Aware-Illumination-Estimation-Using-Multispectral-Image.


Authors (5)

Summary

The paper introduces a novel deep learning framework integrating spatio-spectral attention blocks and illuminant priors for high-dimensional SPD estimation.
It leverages sensor-agnostic spectral-domain transformation, reducing mean angular errors and generalizing effectively across different MS sensor domains.
Empirical results on the MILD and BeyondRGB datasets confirm robust SPD recovery and improved white-balancing under challenging illumination conditions.

Spectrum-Aware Deep Illuminant Estimation from Multispectral Images

Introduction

The paper "Spectrum Aware Illumination Estimation Using Multispectral Image" (2606.14248) addresses the challenge of accurate illuminant spectral power distribution (SPD) estimation from multispectral (MS) images, which is critical for downstream tasks such as color constancy, color rendering, and robust computer vision pipelines. The authors propose a deep learning framework integrating spatio-spectral attention mechanisms and illuminant priors into the feature extraction pipeline, enabling effective spectral correlation exploitation and generalization across MS sensor domains. Additionally, a new dataset, MILD, is introduced, capturing diverse lighting conditions—including challenging monochromatic cases—with high-dimensional ground-truth SPD measured using a spectroradiometer.

Architectural Overview

The proposed framework departs from prior methods by explicitly modeling inter-channel spectral relationships within MS images. It uses two successive Spatio-Spectral Feature Extractors (SSFEs), each integrating spectral attention blocks and an illuminant prior (IP). The network estimates a high-dimensional illuminant SPD and then applies linear projections to the target domain via physically defined matrices, such as camera spectral sensitivity functions (CSF) and color-matching functions (CMF), to allow deployment across heterogeneous camera domains without retraining.

Figure 1: Overall pipeline illustrating high-dimensional illuminant SPD estimation and linear projection to target domains via CSF/CMF matrices.

The architectural backbone leverages 3D convolutions for joint spatial and spectral processing, followed by two spectral attention modules:

Spectral Attention Block with Illuminant Prior (SABIP): Exploits channel-wise spectral correlations, focusing on channels informative for illuminant estimation; incorporates the spatial mean IP vector as a physical cue, shown to be highly correlated with scene illumination.
Multihead Spectral Self Attention Block (MSSAB): Adopts a transformer-style scheme for global spectral self-attention, capturing dependencies among all spectral channels.

Figure 2: Network architecture highlighting dual SSFE stages with SABIP and MSSAB modules.
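
The role of the illuminant prior can be illustrated with a toy computation. The sketch below computes the spatial mean per channel (the IP vector described above) and turns it into channel weights with a softmax. The softmax weighting, the function names, and the pixel values are assumptions made for illustration; the actual SABIP uses learned layers that are not reproduced here.

```python
import math

def illuminant_prior(image):
    """Spatial mean per channel of an H x W x C image given as nested lists."""
    height, width = len(image), len(image[0])
    channels = len(image[0][0])
    sums = [0.0] * channels
    for row in image:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def reweight_channels(features, prior):
    """Scale per-channel features by softmax weights derived from the prior
    (an illustrative stand-in for learned attention)."""
    weights = softmax(prior)
    return [f * w for f, w in zip(features, weights)]

# Toy 2 x 2 image with 3 spectral channels (values are made up).
image = [
    [[0.2, 0.4, 0.6], [0.4, 0.4, 0.2]],
    [[0.0, 0.4, 0.4], [0.2, 0.4, 0.4]],
]
prior = illuminant_prior(image)
weights = softmax(prior)
reweighted = reweight_channels([1.0, 1.0, 1.0], prior)
```

In this toy case the brighter channels receive larger weights, which mirrors the idea of prioritizing channels that carry more illuminant information.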

MILD Dataset and Sensor Characteristics

The MILD dataset comprises 15-channel MS images spanning 380–835 nm, acquired under both natural and artificially synthesized illuminants. These include 42 mono-wavelength sources that deviate substantially from the Planckian locus and pose severe challenges for RGB-based approaches.

Figure 3: MILD dataset scene examples, top: without reference, bottom: with reference color charts and white standards.

Each image is accompanied by spectroradiometer-measured GT SPD (36 channels, 380–730 nm). The sensor response curves for each channel are profiled, enabling principled spectral-domain transformation via CSF matrices.

Figure 4: Spectral response of each MS sensor channel, normalized to [0, 1], characterizing channel selectivity.

Mapping of illuminants in chromaticity space visualizes the dataset's coverage, emphasizing its spectrum diversity and inclusion of non-standard mono-wavelength sources.

Figure 5: CIE-xy chromaticity coordinates for the MILD dataset's illuminants; (a) non-mono-wavelength, (b) mono-wavelength spectra.
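
For reference, CIE-xy coordinates are obtained from tristimulus values by normalizing with their sum. The sketch below assumes the XYZ values are already available (in practice they would come from integrating an SPD against the CMFs, a step not shown here); the function name is illustrative.

```python
def xy_chromaticity(X, Y, Z):
    """Return CIE (x, y) chromaticity coordinates from XYZ tristimulus values."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("XYZ sums to zero; chromaticity is undefined")
    return X / total, Y / total

# Equal tristimulus values map to the equal-energy point (1/3, 1/3).
x, y = xy_chromaticity(1.0, 1.0, 1.0)
```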

Spectral-Domain Transformation and Generalization

A key contribution is the sensor-agnostic spectral-domain transformation, which lets a single high-dimensional illumination estimator generalize across sensor domains and color spaces. Because the mapping through a CSF or CMF matrix is linear, only the spectral directions that the target sensor can observe (those associated with nonzero singular values of the matrix) contribute to the projected result, as the authors demonstrate via SVD analysis. This approach removes the need to retrain per sensor, which makes practical deployment simpler.
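
A minimal sketch of this projection follows, assuming a hypothetical three-channel sensor with Gaussian sensitivities and a 36-sample SPD on a uniform 10 nm grid from 380 to 730 nm (the uniform spacing, the sensor, and all function names are assumptions for illustration, not the paper's data). It projects an SPD through the sensitivity matrix and then checks that a spectral component orthogonal to every sensitivity row leaves the sensor response unchanged. Gram-Schmidt orthogonalization is used here in place of an SVD to keep the example dependency-free.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gaussian_response(wavelengths, center, width):
    """Hypothetical bell-shaped channel sensitivity."""
    return [math.exp(-0.5 * ((w - center) / width) ** 2) for w in wavelengths]

def project(sensitivity, spd):
    """Matrix-vector product: one response value per sensor channel."""
    return [dot(row, spd) for row in sensitivity]

def unobservable_part(sensitivity, vector):
    """Remove from `vector` everything lying in the row space of the
    sensitivity matrix, leaving the part the sensor cannot observe."""
    basis = []
    for row in sensitivity:
        residual = list(row)
        for b in basis:
            coeff = dot(residual, b)
            residual = [x - coeff * y for x, y in zip(residual, b)]
        norm = math.sqrt(dot(residual, residual))
        basis.append([x / norm for x in residual])
    hidden = list(vector)
    for b in basis:
        coeff = dot(hidden, b)
        hidden = [x - coeff * y for x, y in zip(hidden, b)]
    return hidden

wavelengths = [380 + 10 * i for i in range(36)]  # assumed uniform grid
sensitivity = [gaussian_response(wavelengths, c, 30.0) for c in (450, 550, 650)]

flat_spd = [1.0] * len(wavelengths)  # toy stand-in for an estimated SPD
response = project(sensitivity, flat_spd)

# A perturbation outside the observable subspace does not change the response.
hidden = unobservable_part(sensitivity, [math.sin(0.3 * i) for i in range(36)])
perturbed = project(sensitivity, [s + h for s, h in zip(flat_spd, hidden)])
```

The same `project` call works for any target matrix with 36 columns, which is the sense in which one high-dimensional estimate can serve several sensors.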

Empirical Results and Benchmarking

The reported experiments show consistent gains for the proposed method. On the BeyondRGB dataset, the framework achieves a mean angular error (AE) of 2.51° (lab) and 4.92° (field), compared with 5.92° and 7.22° for the previous state of the art (BeyondRGB). On MILD, the mean AE is 3.18°, and estimation remains robust in the mono-wavelength cases (MILD(m): 7.44°). Ablation studies confirm substantial performance gains from spectral attention and IP integration.
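
The angular error between an estimated and a ground-truth spectrum is conventionally the angle between the two vectors, which makes it insensitive to overall intensity. The sketch below uses that standard definition; it is assumed, not confirmed from the paper, that the reported AE follows it exactly.

```python
import math

def angular_error_degrees(estimate, ground_truth):
    """Angle in degrees between two spectra treated as vectors."""
    dot = sum(e * g for e, g in zip(estimate, ground_truth))
    norm_e = math.sqrt(sum(e * e for e in estimate))
    norm_g = math.sqrt(sum(g * g for g in ground_truth))
    if norm_e == 0 or norm_g == 0:
        raise ValueError("angular error is undefined for a zero vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_e * norm_g)))
    return math.degrees(math.acos(cosine))

# A scaled copy of the ground truth has (near) zero error.
same_shape = angular_error_degrees([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
# Orthogonal spectra are 90 degrees apart.
orthogonal = angular_error_degrees([1.0, 0.0], [0.0, 1.0])
```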

Qualitative comparisons on extreme spectral scenes show estimated curves that closely track the ground-truth SPDs and preserve distinct spectral peaks where RGB-based models fail.

Figure 6: Qualitative comparison of SPD estimation across methods on MILD; proposed model accurately fits mono-wavelength ground truth.

White balancing using estimated SPD further demonstrates practical utility, producing visually stable color renderings under challenging illumination.

Figure 7: White-balancing results on MILD; proposed model achieves closest color match to GT illuminant-corrected image.
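
As a rough illustration of how an estimated illuminant is used for white balancing, the sketch below applies a diagonal (von Kries style) correction: each channel is divided by the illuminant's response in that channel. This is a common textbook approach and an assumption here; the paper's exact white-balancing procedure is not specified in this summary, and the values are made up.

```python
def white_balance(pixel, illuminant):
    """Divide each channel by the illuminant response, after scaling the
    illuminant so that its largest channel equals 1."""
    peak = max(illuminant)
    if peak <= 0 or min(illuminant) <= 0:
        raise ValueError("illuminant responses must be positive")
    gains = [peak / value for value in illuminant]
    return [p * g for p, g in zip(pixel, gains)]

# A neutral surface seen under a tinted light becomes neutral again.
illuminant = [0.5, 1.0, 0.8]
observed = [0.25, 0.5, 0.4]
corrected = white_balance(observed, illuminant)
```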

Physical Modeling for Multi-Source and Intensity Estimation

To test extensibility, the estimated SPD is combined with spatial intensity fitting under a physical light attenuation model (Yuksel's point attenuation) in both single-source and multi-source scenarios. The fitted models achieve $R^2 > 0.97$ and low AE in directly lit regions (2.48°–5.57°), indicating that the estimated spectral shape is sufficient for intensity-level modeling and spectral superposition in controlled environments.

Figure 8: Single-source illumination estimation with spatial attenuation model; direct regions exhibit strong agreement with fitted SPD.

Figure 9: Multi-source spectral superposition verification and spatial intensity fitting, confirming linear combination and accurate intensity recovery.
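
The fitting idea can be sketched with a simpler stand-in. The example below fits a plain inverse-square falloff (not Yuksel's attenuation model, which is not reproduced here) by closed-form least squares, reports the coefficient of determination, and forms a weighted sum of two spectra to illustrate superposition. All names and numbers are hypothetical.

```python
def fit_inverse_square(distances, intensities):
    """Least-squares scale k for the model I(d) = k / d**2."""
    basis = [1.0 / (d * d) for d in distances]
    numerator = sum(b * y for b, y in zip(basis, intensities))
    denominator = sum(b * b for b in basis)
    return numerator / denominator

def r_squared(observed, predicted):
    """Coefficient of determination of a fit."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def superpose(spd_a, weight_a, spd_b, weight_b):
    """Linear combination of two source spectra."""
    return [weight_a * a + weight_b * b for a, b in zip(spd_a, spd_b)]

distances = [1.0, 2.0, 4.0]
intensities = [8.0, 2.0, 0.5]  # synthetic data following 8 / d**2 exactly
k = fit_inverse_square(distances, intensities)
predicted = [k / (d * d) for d in distances]
fit_quality = r_squared(intensities, predicted)

mixed = superpose([1.0, 0.0, 0.0], 2.0, [0.0, 1.0, 1.0], 0.5)
```

With noisy measurements the fit quality would fall below 1, and a threshold such as the reported 0.97 indicates how well the attenuation model explains the observed intensities.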

Complexity and Ablation

Complexity analysis shows negligible overhead for high-dimensional SPD estimation, with only a minor increase in parameter count and inference time compared to direct low-dimensional prediction. Ablation studies highlight the impact of spectral attention and physical prior integration, with IP providing non-trivial gains over random or learnable vectors.

Implications and Future Directions

The framework provides robust, generalizable SPD estimation across MS sensor domains, supporting critical vision pipelines such as color correction and white balancing in real-world, non-standard illumination. The sensor-agnostic spectral-domain projection paves the way for versatile single-model deployments, while physical attenuation integration facilitates modeling of spatially varying illumination and multi-source environments.

Future work should extend spatially varying illuminant estimation to unconstrained environments, incorporating neural source detection, occlusion-aware attenuation, and real-world indirect illumination modeling. Further research on end-to-end spatial spectral estimation and integration with downstream vision models is warranted.

Conclusion

This work establishes a scalable, physically grounded paradigm for MS-based illuminant estimation, combining deep spectral attention, physical priors, and sensor-agnostic transformation to achieve robust SPD recovery and practical deployment. The MILD dataset, with its diversity of lighting scenarios, also provides a substantial benchmark for advancing research in spectral imaging and color constancy.
