Visible/Thermal-Infrared Composites
- Visible/Thermal-Infrared Composites are multimodal systems that integrate visible detail and thermal signatures to provide enhanced imaging and material performance.
- Advanced fusion methods, including Bayesian, deep learning, and saliency-guided techniques, optimize edge preservation and thermal detection across modalities.
- Physical implementations utilize multilayer photonic structures, IR-transparent textiles, and electrochromic materials to achieve dynamic spectral control and adaptive camouflage.
Visible/Thermal-Infrared Composites are multimodal structures, algorithms, and materials that spatially or temporally integrate information from the visible (VIS: ≈ 400–780 nm) and thermal-infrared (IR: ≈ 3–15 μm) spectral regions. These composites are foundational in imaging, remote sensing, photonic device engineering, and adaptive camouflage. Their central goal is to synthesize physical structures or digital images that simultaneously leverage the complementary contrast mechanisms of the two bands: high spatial and textural acuity from the visible band, and radiative/thermal signatures from the infrared band.
1. Fundamental Principles and Objectives
The creation of visible/thermal-infrared composites is driven by the inherent differences and complementarities between the two spectral bands. Visible imaging captures fine-grained reflectance and texture but degrades under low light and obscurants; thermal-infrared imaging is insensitive to illumination and provides direct access to emissivity and temperature variations, but typically lacks textural richness and suffers from lower spatial resolution. Effective VIS/IR composites seek to:
- Preserve sharp gradients, edges, and textures as present in the visible band.
- Retain salient thermal features and highlight emitting objects from the IR band.
- Maintain geometric consistency and registration across modalities.
- Provide interpretable, artifact-minimized representations suitable for both human and automated interpretation.
- In physical composites, achieve selective spectral transmittance or emissivity for purposes such as personal cooling, camouflage, or adaptive thermoregulation.
The field encompasses algorithmic fusion at the pixel, feature, or score level (Yang et al., 5 May 2025, Chen et al., 27 Jun 2024, Elias et al., 13 Dec 2025, Zhao et al., 2020), physical multilayer and photonic material designs (Dang et al., 2021, Tong et al., 2015, Mandal et al., 2019), and neural representations (Sun et al., 20 Jun 2025, Lin et al., 22 Jul 2024).
2. Algorithmic Image Fusion Methodologies
A diverse set of digital approaches has been developed, often categorized into pixel-based, transform-based, and learning-based techniques.
2.1 Bayesian and Model-based Methods
The Bayesian fusion approach (Zhao et al., 2020) frames image fusion as a regression problem with the objective

$$\hat{F} = \arg\min_{F}\; \tfrac{1}{2}\,\lVert F - I_{\mathrm{IR}} \rVert_2^2 + \lambda\, \mathrm{TV}(F - I_{\mathrm{VIS}}),$$

where $I_{\mathrm{IR}}$ (IR) and $I_{\mathrm{VIS}}$ (VIS) are pre-registered inputs and $F$ is the fused image. This is reformulated in a hierarchical Bayesian framework, introducing latent variables for uncertainty modeling and imposing a total-variation (TV) prior to preserve visible-image edges:

$$p(F \mid I_{\mathrm{VIS}}) \;\propto\; \exp\!\big(-\lambda\, \mathrm{TV}(F - I_{\mathrm{VIS}})\big), \qquad \mathrm{TV}(U) = \sum_{i,j} \big|(\nabla U)_{i,j}\big|.$$
The fusion is optimized via an EM algorithm with half-quadratic splitting. Experimentally, this method achieves high mutual information (MI), standard deviation (SD), and edge-preservation (Q^{AB/F}) scores on standard datasets, demonstrating superior retention of both thermal targets and textural detail.
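The optimization step can be made concrete with a minimal NumPy sketch, assuming the anisotropic-TV form of the objective above; the fixed penalty parameter `mu` and the plain gradient inner loop are illustrative stand-ins for the paper's full EM updates with latent noise variables, not the published implementation.

```python
import numpy as np

def grad(u):
    """Forward-difference gradient (anisotropic TV discretization)."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(gx, gy):
    """Discrete divergence, the negative adjoint of grad."""
    dx = np.zeros_like(gx); dy = np.zeros_like(gy)
    dx[:, 0] = gx[:, 0]; dx[:, 1:] = gx[:, 1:] - gx[:, :-1]
    dy[0, :] = gy[0, :]; dy[1:, :] = gy[1:, :] - gy[:-1, :]
    return dx + dy

def shrink(v, t):
    """Soft-thresholding: the closed-form auxiliary-variable step."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fuse_tv(ir, vis, lam=0.1, mu=1.0, outer=50, inner=10, step=0.2):
    """Minimize 0.5*||F - IR||^2 + lam*TV(F - VIS) by half-quadratic splitting.

    Substituting U = F - VIS turns the problem into TV-regularized
    regression toward R = IR - VIS; F is recovered as U + VIS.
    Inputs are float arrays scaled to [0, 1].
    """
    r = ir - vis
    u = np.zeros_like(r)
    for _ in range(outer):
        # Auxiliary step: shrinkage on the gradient field of U
        gx, gy = grad(u)
        ax, ay = shrink(gx, lam / mu), shrink(gy, lam / mu)
        # Quadratic step: a few gradient iterations on U
        for _ in range(inner):
            gx, gy = grad(u)
            u = u - step * ((u - r) - mu * div(gx - ax, gy - ay))
    return np.clip(u + vis, 0.0, 1.0)
```

In this substitution view, `fuse_tv(ir, vis)` returns a composite that tracks IR intensities while inheriting the visible image's edge structure through the TV term.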
2.2 Deep Learning and Attention-Based Fusion
Modern methods frequently employ dual-stream or multi-stream convolutional neural networks (CNNs) or auto-encoders:
- Dual-stream AE with attention fusion (Zhao et al., 2020): The encoder decomposes both VIS and IR into base (low-frequency) and detail (high-frequency) maps. Attention-based fusion combines these, leveraging saliency-weighted spatial maps and channel-wise pooling. The decoder reconstructs the final composite. This approach achieves top performance across entropy, spatial frequency, and information fidelity, confirming the efficacy of learned decomposition plus adaptive attention (a minimal sketch follows this list).
- Retinex-based multistream CNNs (Chen et al., 27 Jun 2024): Both modalities are decomposed into illumination and reflectance components, fused additively at the feature level, and recombined to reconstruct the composite image. This achieves leading metrics (entropy, detail-richness, mutual information) on TNO and VOT2020-RGBT benchmarks.
- Quaternion domain fusion (Yang et al., 5 May 2025): RGB channels are jointly encoded as quaternions, with all structural, gradient, and Bayesian operations performed in hypercomplex algebra. This holistically preserves color-thermal correlations and outperforms real-valued approaches on detail, spatial frequency, and artifact suppression, especially in color-shifted or low-visibility scenes.
- Implicit Neural Representations (INR) (Sun et al., 20 Jun 2025): An MLP models the fused image as a continuous spatial function. Inputs are normalized (x, y, I_IR, I_VIS), outputs are fused RGB, and losses include pixel, gradient, and regularization terms. The model enables resolution-independent fusion and direct super-resolution, leading all benchmarks in information fidelity (VIF), multiscale SSIM, and entropy.
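As a concrete illustration of the dual-stream design in the first item above, the following PyTorch sketch pairs base/detail encoders with a per-pixel softmax attention rule; the layer widths, the channel-pooled saliency proxy, and the class name DualStreamFusion are assumptions made for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class DualStreamFusion(nn.Module):
    """Minimal dual-stream auto-encoder with attention fusion.

    Each modality is split into base (low-frequency) and detail
    (high-frequency) feature maps; per-pixel softmax attention over a
    channel-pooled saliency proxy decides each modality's contribution.
    """
    def __init__(self, ch=32):
        super().__init__()
        self.base_enc = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.detail_enc = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

    @staticmethod
    def attention_fuse(fa, fb):
        # Channel-wise average pooling as a cheap saliency map,
        # then a per-pixel softmax across the two modalities.
        w = torch.softmax(torch.cat([fa.mean(1, keepdim=True),
                                     fb.mean(1, keepdim=True)], dim=1), dim=1)
        return w[:, :1] * fa + w[:, 1:] * fb

    def forward(self, vis, ir):
        base = self.attention_fuse(self.base_enc(vis), self.base_enc(ir))
        detail = self.attention_fuse(self.detail_enc(vis), self.detail_enc(ir))
        return self.decoder(torch.cat([base, detail], dim=1))

# Usage: fused = DualStreamFusion()(vis, ir) for (B, 1, H, W) tensors in [0, 1].
```

Training would typically combine pixel, gradient, and structural-similarity losses against both source images, in line with the methods above.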
2.3 Region-driven and Saliency-guided Methods
Co-fusion approaches utilize region saliency or multi-exposure guidance:
- Perceptual region-driven fusion with SVE cameras (Tao et al., 6 Dec 2025): Multi-exposure visible images and a registered IR frame are fused according to spatially-varying saliency maps—computed via intensity, contrast, and variance cues. Retinex decompositions are fused via adaptive pyramids. Structural similarity compensation is performed via local SSIM, with regional weights increasing IR prominence in visually uncertain regions. This results in top mutual information and visual fidelity, especially under dynamic range extremes and environmental haze.
- Pixel-level saliency and statistical weighting (Elias et al., 13 Dec 2025): Low/high-frequency decompositions are performed (mean filtering, anisotropic diffusion, or undecimated wavelet/Karhunen-Loève transforms), and saliency or local covariance yields adaptive fusion weights (see the sketch below). These methods are quantitatively shown to double the number of feature matches and loop closures in SLAM under extreme low illumination, reducing navigation error by nearly an order of magnitude.
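The decomposition-plus-weighting recipe from the second item can be sketched in a few lines of NumPy/SciPy; mean filtering for the base layer and local variance as the saliency statistic are the simplest of the cited options, and the window size is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def saliency_fuse(vis, ir, win=9):
    """Two-scale fusion: averaged base layers plus variance-weighted details.

    Mean filtering provides the low-frequency base; local variance acts
    as a per-pixel saliency statistic for weighting the detail layers.
    """
    def decompose(img):
        base = uniform_filter(img, size=win)
        return base, img - base

    def local_var(img):
        m = uniform_filter(img, size=win)
        return np.maximum(uniform_filter(img * img, size=win) - m * m, 1e-8)

    base_v, det_v = decompose(vis)
    base_i, det_i = decompose(ir)
    w_v, w_i = local_var(vis), local_var(ir)
    detail = (w_v * det_v + w_i * det_i) / (w_v + w_i)  # saliency-adaptive
    base = 0.5 * (base_v + base_i)                      # simple average
    return np.clip(base + detail, 0.0, 1.0)
```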
2.4 Generative and Diffusion-based Synthesis
Instead of fusing simultaneous inputs, visible-to-thermal translation is achieved via diffusion models guided by foundation models (Paranjape et al., 3 Apr 2025). F-ViTA conditions a diffusion denoising process on zero-shot semantic masks and labels (from SAM and Grounded DINO), synthesizing physically plausible IR representations aligned with the scene's content. These synthetic IR images are then blended with VIS via alpha compositing, serving multiple downstream tasks.
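The final blending step is plain alpha compositing; in this sketch the uniform alpha and the function name alpha_composite are illustrative, and a per-pixel alpha map derived from the synthetic IR intensities could instead emphasize hot regions.

```python
import numpy as np

def alpha_composite(vis_rgb, ir_synth, alpha=0.5):
    """Blend a synthesized single-channel thermal image into a visible
    RGB frame; arrays are floats in [0, 1]."""
    ir_rgb = np.repeat(ir_synth[..., None], 3, axis=-1)  # gray -> RGB
    return np.clip(alpha * ir_rgb + (1.0 - alpha) * vis_rgb, 0.0, 1.0)
```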
3. Physical and Material Visible/Thermal-Infrared Composites
Visible/thermal-infrared composites are realized as engineered materials, often targeting adaptive camouflage, personal or building thermal management, or selective radiative cooling. Two prominent design motifs are multilayer photonic structures and electrochromic composites.
3.1 Photonic Crystal Structures
ZnO/Ag/ZnO trilayer photonic crystals patterned with periodic micron-scale apertures achieve simultaneous visible transparency and IR management (Dang et al., 2021). The stack parameters (aperture D=3 μm, period P=5.5 μm) exploit Fabry–Pérot and extraordinary optical transmission effects:
- Visible domain: Multilayer transfer-matrix analysis predicts >80% visible-band transmittance, confirmed experimentally (a transfer-matrix sketch follows this list).
- Thermal-IR domain: The patterned Ag layer yields emissivity minima (ε < 0.1) in the 3–5 μm and 8–14 μm atmospheric windows (camouflage), while the 5–8 μm band exhibits ε_peak ≈ 0.8 (for radiative cooling).
- Thermal management: Under 0.4 W/cm² heating, the photonic crystal achieves ΔT ≈ 12.2 K lower equilibrium temperature versus low-ε control, via enhanced mid-IR emission.
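The unpatterned trilayer's visible response can be estimated with a standard normal-incidence transfer-matrix routine, as sketched below; the ZnO/Ag refractive indices and layer thicknesses in the usage line are rough illustrative values rather than the fabricated stack, and the aperture/EOT physics additionally requires full-wave solvers.

```python
import numpy as np

def tmm_transmittance(n_layers, d_layers, lam, n_in=1.0, n_out=1.0):
    """Normal-incidence transmittance of a planar stack via the
    standard characteristic-matrix (transfer-matrix) method.

    n_layers: complex refractive index of each layer;
    d_layers: thicknesses in the same units as the wavelength lam.
    """
    M = np.eye(2, dtype=complex)
    for n, d in zip(n_layers, d_layers):
        delta = 2 * np.pi * n * d / lam  # phase thickness of the layer
        M = M @ np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                          [1j * n * np.sin(delta), np.cos(delta)]])
    B = M[0, 0] + M[0, 1] * n_out
    C = M[1, 0] + M[1, 1] * n_out
    t = 2 * n_in / (n_in * B + C)
    return (np.real(n_out) / np.real(n_in)) * abs(t) ** 2

# Illustrative only: 40/20/40 nm ZnO/Ag/ZnO at 550 nm (assumed indices).
print(tmm_transmittance([2.0, 0.06 + 3.3j, 2.0],
                        [40e-9, 20e-9, 40e-9], 550e-9))
```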
3.2 Infrared-Transparent, Visible-Opaque Fabrics
Polyethylene microfiber textiles are designed to maximize mid/far-IR transmission by suppressing Rayleigh scattering (D_f ≪ 10 μm) while maintaining visible opacity through Mie scattering (D_f ≈ 1 μm) (Tong et al., 2015). Full-wave FEM simulations confirm that a design of 1 μm fibers woven into 30 μm yarns achieves the following (a size-parameter check appears after the list):
- High transmittance and low reflectance in the mid-IR band;
- Low transmittance and high reflectance (opacity) in the visible band;
- Direct radiative cooling through the fabric of >23 W/m² at 26.1 °C ambient.
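The fiber-diameter reasoning reduces to the Mie size parameter x = πD/λ, as the quick check below illustrates (wavelengths chosen for illustration): a 1 μm fiber scatters visible light strongly while remaining nearly transparent at thermal wavelengths.

```python
import numpy as np

def size_parameter(diameter, lam):
    """Mie size parameter x = pi * D / lambda.

    x >~ 1 implies strong Mie scattering (visible opacity);
    x << 1 approaches the weak Rayleigh regime (IR transparency).
    """
    return np.pi * diameter / lam

D_f = 1e-6                                # 1 um fiber diameter
print(size_parameter(D_f, 0.55e-6))       # visible: x ~ 5.7 -> opaque
print(size_parameter(D_f, 9.5e-6))        # mid-IR:  x ~ 0.33 -> transparent
```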
3.3 Electrochromic Broadband Modulators
Li₄Ti₅O₁₂ (LTO) nanoparticle-based electrodes exhibit electrically-tunable reflectance/emittance across solar, MWIR, and LWIR bands (Mandal et al., 2019). Li-intercalation modulates carrier density and consequently Drude free-carrier absorption:
- ΔR_solar ≈ 0.74; Δε_MWIR ≈ 0.68; Δε_LWIR ≈ 0.3.
- Reversible switching is retained over >100 cycles.
- Demonstrated applicability to hybrid thermal-management (switchable heating/cooling) and dynamic IR camouflage.
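The carrier-density mechanism can be illustrated with a Drude-model estimate: intercalation raises the carrier density and hence the plasma frequency, increasing IR reflectance and, by Kirchhoff's law for an opaque film, lowering emissivity. The eps_inf, gamma, effective-mass, and carrier-density values below are generic illustrative numbers, not fitted LTO parameters.

```python
import numpy as np

E_CHARGE = 1.602e-19   # C
EPS0 = 8.854e-12       # F/m
M_E = 9.109e-31        # kg
C0 = 3.0e8             # m/s

def drude_emissivity(lam, n_carrier, eps_inf=4.0, gamma=1e14, m_eff=1.0):
    """Kirchhoff estimate of normal emissivity (1 - R) for an opaque
    Drude medium; intercalation enters through the carrier density."""
    w = 2 * np.pi * C0 / lam
    wp2 = n_carrier * E_CHARGE**2 / (EPS0 * m_eff * M_E)  # plasma freq^2
    eps = eps_inf - wp2 / (w**2 + 1j * gamma * w)          # Drude permittivity
    n = np.sqrt(eps)                                       # complex index
    R = abs((1 - n) / (1 + n)) ** 2                        # Fresnel reflectance
    return 1.0 - R

lam = 10e-6                                  # LWIR wavelength
print(drude_emissivity(lam, 1e25))           # de-intercalated: emissive (~0.9)
print(drude_emissivity(lam, 1e27))           # intercalated: reflective (~0.1)
```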
4. Quantitative Metrics and Benchmarking
Robust assessment of VIS/IR composites utilizes both image-processing and physically-motivated criteria, including:
| Metric | Domain | Significance |
|---|---|---|
| Entropy (EN) | Algorithmic | Information content and variety in the fused image |
| Mutual Information (MI) | Algorithmic | Shared information between fused and source images |
| Standard Deviation (SD) | Algorithmic | Contrast/detail richness |
| Structural Similarity (SSIM, MS-SSIM) | Algorithmic | Structural and perceptual fidelity; multiscale extensions evaluate quality at different scales |
| Visual Information Fidelity (VIF) | Algorithmic | Amount of source information preserved in the composite |
| Average Gradient (AG), Spatial Frequency (SF) | Algorithmic | Sharpness of features and edge retention |
| Emissivity (ε) | Physical | Governs radiative cooling/camouflage efficacy |
| Transmittance (T), Reflectance (R) | Physical | Control visible and IR pass/block functions |
In benchmarking across public datasets (TNO, VOT2020-RGBT, FLIR, KAIST), leading fusion approaches show state-of-the-art performance in MI, EN, SD, SSIM, and VIF, while physical structures are characterized by their spectral ε, T, R and by direct thermal modulation and camouflage demonstrations.
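The image-side metrics reduce to short histogram computations; the sketch below assumes grayscale images scaled to [0, 1] and a 256-bin discretization, with the overall fusion MI conventionally taken as MI(fused, IR) + MI(fused, VIS).

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (EN) of the intensity histogram, in bits."""
    p, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(a, b, bins=256):
    """MI between two images, estimated from the joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                                 range=[[0.0, 1.0], [0.0, 1.0]])
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz]))

# SD is simply np.std(fused).
```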
5. Applications, Challenges, and Prospective Directions
Visible/thermal-infrared composites have broad utility in autonomous navigation (Elias et al., 13 Dec 2025), surveillance (Malviya et al., 2010), biometric recognition (Espinosa-Duró et al., 2022), adaptive camouflage (Dang et al., 2021, Mandal et al., 2019), personal cooling (Tong et al., 2015), and photogrammetry (Tao et al., 6 Dec 2025).
Applications
- Multimodal navigation: High-fidelity fused images improve loop closures and pose accuracy in SLAM under eclipse or poor visibility (Elias et al., 13 Dec 2025).
- Energy management: Passively cooled clothing and smart facades reduce HVAC loads (Tong et al., 2015, Mandal et al., 2019).
- Scene understanding and segmentation: Feature fusion boosts downstream detection, tracking, and semantic parsing in robotics and automotive contexts (Chen et al., 27 Jun 2024, Lin et al., 22 Jul 2024, Paranjape et al., 3 Apr 2025).
- Dynamic spectral camouflage: Materials can spectrally match their background for adaptive obfuscation in MWIR/LWIR (Dang et al., 2021).
Challenges
- Accurate multi-sensor registration and calibration remain nontrivial, particularly with cross-resolution imagery or moving targets (Zhao et al., 2020, Sun et al., 20 Jun 2025, Lin et al., 22 Jul 2024).
- Physical composites often face fabrication constraints, e.g., achieving uniform aperture or fiber diameters and scalability for large-area deployment (Dang et al., 2021, Tong et al., 2015).
- Current fusion algorithms may require manual tuning of loss weights and lack robustness to extreme conditions (e.g., ultra-low-contrast IR, scene dynamics) (Chen et al., 27 Jun 2024, Tao et al., 6 Dec 2025).
- Color mapping and perceptual consistency are open issues, especially for composites targeting human interpretation (Yang et al., 5 May 2025).
Prospects
Anticipated advances include joint fusion-classification frameworks, learning-based hyperparameter tuning, foundation-model-guided translation, multispectral extension beyond VIS/IR (e.g., SWIR, MWIR, X-ray), and material integration strategies for multi-band adaptive control (Paranjape et al., 3 Apr 2025, Tao et al., 6 Dec 2025). The continued convergence of algorithmic, neural, and materials research is expected to propel the field toward more functionally integrated, context-aware, and spectrally adaptive composites.
6. References to Notable Works
The field of visible/thermal-infrared composites spans the following representative works:
- Bayesian regression fusion: "Bayesian Fusion for Infrared and Visible Images" (Zhao et al., 2020)
- Retinex-CNN decomposition: "SimpleFusion: A Simple Fusion Framework for Infrared and Visible Images" (Chen et al., 27 Jun 2024)
- Hypercomplex algebra: "Quaternion Infrared Visible Image Fusion" (Yang et al., 5 May 2025)
- Neural radiance fields for multispectral rendering: "ThermalNeRF: Thermal Radiance Fields" (Lin et al., 22 Jul 2024)
- Physical photonic structures: "A visible–infrared compatible camouflage photonic crystal with enhanced emission in 5–8 μm" (Dang et al., 2021)
- Electrochromic modulation: "Li4Ti5O12: A Visible-to-Infrared Broadband Electrochromic Material" (Mandal et al., 2019)
- Region-based and adaptive co-fusion: "Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement" (Tao et al., 6 Dec 2025)
These works together define the algorithmic, physical, and application-centered landscape of visible/thermal-infrared composites.