- The paper introduces RFN-Nest, which pairs learnable residual fusion modules with a UNet++-style nest-connection decoder to fuse infrared and visible images.
- It employs a two-stage training process: an auto-encoder stage for feature extraction and image reconstruction, followed by RFN training with detail-preserving and feature-enhancing loss functions.
- Experimental results on TNO and VOT datasets demonstrate superior fusion performance, with notable improvements in entropy, standard deviation, and mutual information.
Overview of RFN-Nest: An End-to-End Residual Fusion Network for Infrared and Visible Images
The paper presents RFN-Nest, an end-to-end network for infrared and visible image fusion. Instead of relying on hand-crafted fusion rules, the architecture inserts a learnable Residual Fusion Network (RFN) between a deep encoder and decoder, so that feature extraction, fusion, and image reconstruction are all learned from data.
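To make the data flow concrete, the sketch below shows one plausible PyTorch rendering of the encode → per-scale residual fusion → decode pipeline. Everything in it is an illustrative assumption rather than the paper's configuration: the ToyEncoder/ToyFusion/ToyDecoder modules, the two scales, the 16/32 channel counts, and the sigmoid output are stand-ins chosen only to keep the example short and runnable.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Produces two feature scales from a single-channel image (stand-in for the shared encoder)."""
    def __init__(self):
        super().__init__()
        self.s1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.s2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())

    def forward(self, x):
        f1 = self.s1(x)
        f2 = self.s2(f1)
        return [f1, f2]  # fine-to-coarse multi-scale features


class ToyFusion(nn.Module):
    """Residual fusion of two same-shape feature maps (stand-in for one RFN)."""
    def __init__(self, c):
        super().__init__()
        self.mix = nn.Sequential(nn.Conv2d(2 * c, c, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(c, c, 3, padding=1))
        self.skip = nn.Conv2d(2 * c, c, 1)

    def forward(self, a, b):
        x = torch.cat([a, b], dim=1)
        return self.mix(x) + self.skip(x)  # residual connection around the mixing path


class ToyDecoder(nn.Module):
    """Upsamples and merges the fused scales into one image (stand-in for the nest-connection decoder)."""
    def __init__(self):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.merge = nn.Conv2d(16 + 32, 1, 3, padding=1)

    def forward(self, feats):
        f1, f2 = feats
        return torch.sigmoid(self.merge(torch.cat([f1, self.up(f2)], dim=1)))


encoder, decoder = ToyEncoder(), ToyDecoder()
fusers = nn.ModuleList([ToyFusion(16), ToyFusion(32)])  # one learnable fusion block per scale

ir, vi = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)  # infrared / visible inputs
feats_ir, feats_vi = encoder(ir), encoder(vi)
fused_feats = [f(a, b) for f, a, b in zip(fusers, feats_ir, feats_vi)]
fused_image = decoder(fused_feats)  # -> (1, 1, 64, 64)
```

In the actual RFN-Nest the encoder and decoder are deeper, the decoder uses nest (UNet++-style) connections across scales, and there are more fusion scales; the only point of the sketch is where the learnable fusion blocks sit between encoder and decoder.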
Key Components and Methodology
- Network Architecture:
- RFN-Nest is composed of an encoder, a decoder, and its core component, the Residual Fusion Network (RFN). The encoder extracts multi-scale features from both source images, and an RFN fuses the features at each scale before decoding.
- The decoder uses nest connections, similar to the UNet++ architecture, to reconstruct a single fused image from the multi-scale fused features.
- Training Strategy:
- A two-stage training process is employed. First, the encoder and decoder are trained as an auto-encoder with pixel-level and SSIM losses, ensuring robust feature extraction and reconstruction.
- The RFN modules are then trained with the encoder and decoder fixed, using a loss function designed to preserve detail from the visible image and salient features from the infrared image.
- Loss Functions:
- A detail-preserving loss (L_detail) and a feature-enhancing loss (L_feature) are introduced to train the RFN, balancing detail retention from the visible image against feature saliency from the infrared image; a hedged sketch of both losses follows this list.
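As a companion to the loss description above, here is one hedged way the two RFN losses could be written in PyTorch. The SSIM here is a simplified, window-free variant (the paper relies on the usual windowed SSIM), and all weights, including the per-scale weights, the infrared/visible balance, and the stage-2 trade-off factor alpha, are placeholder values rather than the paper's tuned settings.

```python
import torch
import torch.nn.functional as F


def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Window-free SSIM over the whole image (a simplification of windowed SSIM)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def detail_loss(fused, visible):
    # L_detail: keep the fused image structurally close to the visible image.
    return 1.0 - ssim_global(fused, visible)


def feature_loss(fused_feats, ir_feats, vi_feats, scale_weights, w_ir=3.0, w_vi=1.0):
    # L_feature: at every scale, pull the fused features toward a weighted
    # combination of the source features, weighting the infrared branch more
    # heavily so salient thermal structures survive fusion (weights are placeholders).
    loss = fused_feats[0].new_zeros(())
    for w, f_f, f_ir, f_vi in zip(scale_weights, fused_feats, ir_feats, vi_feats):
        target = w_ir * f_ir + w_vi * f_vi
        loss = loss + w * F.mse_loss(f_f, target, reduction="sum")
    return loss


# Smoke test with random tensors at two scales:
def rand_feats():
    return [torch.rand(1, 16, 64, 64), torch.rand(1, 32, 32, 32)]

print(detail_loss(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)).item())
print(feature_loss(rand_feats(), rand_feats(), rand_feats(), scale_weights=[1.0, 10.0]).item())

# Illustrative stage-2 objective: with the encoder and decoder frozen, only the
# RFN parameters are optimised against alpha * L_detail + L_feature.
```

The split mirrors the two-stage strategy described above: the auto-encoder losses shape feature extraction and reconstruction in stage one, while this combined objective trains only the fusion blocks in stage two.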
Experimental Results and Comparisons
- Performance Evaluation:
- RFN-Nest was evaluated on images from the TNO and VOT2020-RGBT datasets, outperforming existing methods in both subjective and objective evaluations.
- Entropy, standard deviation, and mutual information were used to quantify fusion quality, showing that RFN-Nest delivers visually pleasing and information-rich fused images (a sketch of these metrics follows this list).
- Application in RGBT Tracking:
- To illustrate the efficacy of the RFN, it was integrated into a state-of-the-art object tracker (AFAT), enhancing tracking performance in challenging multi-modal scenarios.
- The tracker showed improved results on the VOT2019 and VOT2020-RGBT datasets, indicating the broader applicability of the RFN-Nest architecture beyond image fusion.
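For reference, the three fusion metrics mentioned above can be computed as follows for 8-bit grayscale images. This is a plausible NumPy rendering of the standard definitions, not the paper's evaluation code, which may use different binning or normalisation; fusion papers usually report mutual information summed over both source images, as in the final comment.

```python
import numpy as np


def entropy(img: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of an 8-bit grayscale image; higher means more information content."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def std_dev(img: np.ndarray) -> float:
    """Standard deviation of intensities, a simple proxy for global contrast."""
    return float(img.std())


def mutual_information(src: np.ndarray, fused: np.ndarray, bins: int = 256) -> float:
    """Mutual information between a source image and the fused result."""
    joint, _, _ = np.histogram2d(src.ravel(), fused.ravel(),
                                 bins=bins, range=[[0, 255], [0, 255]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal over the source image
    py = pxy.sum(axis=0, keepdims=True)  # marginal over the fused image
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())


# Typical reporting: mutual_information(ir, fused) + mutual_information(vi, fused),
# where ir, vi, and fused are uint8 arrays of the same shape.
```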
Implications and Future Work
The introduction of RFN-Nest highlights the potential of leveraging deep learning for optimal image fusion strategies. By enabling end-to-end training and ensuring adaptability through learnable fusion mechanisms, RFN-Nest sets a precedent for the development of robust fusion networks applicable to diverse tasks such as surveillance, autonomous driving, and advanced tracking systems.
Future research may focus on expanding training datasets or integrating attention mechanisms to further enhance feature extraction and fusion precision. Additionally, exploring RFN's adaptability in other multi-modal contexts or its integration into complex vision systems could open new avenues of research and application.
In conclusion, RFN-Nest offers a practical, end-to-end approach to infrared and visible image fusion, advancing beyond traditional hand-crafted fusion rules by combining a nest-connection architecture with purpose-built loss functions for improved performance and applicability.