Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image Fusion (2311.01886v2)
Abstract: Multi-modal image fusion (MMIF) integrates valuable information from images of different modalities into a single fused image. However, fusing multiple visible images with different focal regions together with infrared images is an unprecedented challenge in real MMIF applications, because the limited depth of focus of visible optical lenses prevents capturing all focal information of a scene simultaneously. To address this issue, we propose an MMIF framework that jointly performs focused integration and modality information extraction. Specifically, a semi-sparsity-based smoothing filter is introduced to decompose the images into structure and texture components. A novel multi-scale operator then fuses the texture components; it detects significant information by considering both the focus attributes of each pixel and relevant data from the other modal images. Additionally, to capture scene luminance effectively and maintain reasonable contrast, we model the distribution of energy information in the structure components in terms of multi-directional frequency variance and information entropy. Extensive experiments on existing MMIF datasets, as well as on object detection and depth estimation tasks, consistently demonstrate that the proposed algorithm surpasses state-of-the-art methods in both visual perception and quantitative evaluation. The code is available at https://github.com/ixilai/MFIF-MMIF.
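The decompose-then-fuse pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: a plain box blur stands in for the semi-sparsity smoothing filter, a local-energy maximum rule stands in for the multi-scale focus-aware texture operator, and an entropy-weighted average stands in for the variance/entropy structure rule. All function names and parameters here are hypothetical.

```python
import numpy as np

def box_blur(img, r=3):
    """Uniform box filter (a simple stand-in for the paper's
    semi-sparsity smoothing filter)."""
    pad = np.pad(img, r, mode="edge")
    k = 2 * r + 1
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def decompose(img, r=3):
    """Split an image into a smooth structure layer and a texture residual."""
    structure = box_blur(img, r)
    return structure, img - structure

def fuse_textures(t_a, t_b):
    """Keep, per pixel, the texture with larger local energy — a crude
    proxy for 'in focus / significant detail'."""
    e_a = box_blur(t_a ** 2, 2)
    e_b = box_blur(t_b ** 2, 2)
    return np.where(e_a >= e_b, t_a, t_b)

def fuse_structures(s_a, s_b):
    """Entropy-weighted average of the structure layers, so the layer
    carrying more information contributes more luminance."""
    def entropy(x):
        hist, _ = np.histogram(x, bins=64, range=(0.0, 1.0))
        p = hist / (hist.sum() + 1e-12)
        p = p[p > 0]
        return -(p * np.log2(p)).sum()
    w_a, w_b = entropy(s_a), entropy(s_b)
    return (w_a * s_a + w_b * s_b) / (w_a + w_b + 1e-12)

def fuse(img_a, img_b):
    """Two-branch fusion: fuse structure and texture layers separately,
    then recombine."""
    s_a, t_a = decompose(img_a)
    s_b, t_b = decompose(img_b)
    return fuse_structures(s_a, s_b) + fuse_textures(t_a, t_b)
```

In the actual framework, the smoothing, texture-fusion, and structure-fusion rules are each considerably more sophisticated (multi-scale, focus-aware, and multi-directional); this sketch only mirrors the overall two-branch architecture.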