BusReF: Infrared-Visible images registration and fusion focus on reconstructible area using one set of features (2401.00285v1)

Published 30 Dec 2023 in cs.CV and cs.AI

Abstract: In scenarios where multi-modal cameras operate together, the problem of working with non-aligned images cannot be avoided. Yet existing image fusion algorithms rely heavily on strictly registered input image pairs to produce more precise fusion results, as a way to improve the performance of downstream high-level vision tasks. One way to relax this assumption is to register the images first. However, existing methods for registering multiple modalities have limitations, such as complex structures and reliance on significant semantic information. This paper addresses image registration and fusion in a single framework, called BusReF, focusing on the infrared-visible registration and fusion (IVRF) task. In this framework, unaligned input image pairs pass through three stages: coarse registration, fine registration, and fusion. It will be shown that this unified approach enables more robust IVRF. We also propose a novel training and evaluation strategy that uses masks to reduce the influence of non-reconstructible regions on the loss functions, which greatly improves the accuracy and robustness of the fusion task. Last but not least, a gradient-aware fusion network is designed to preserve complementary information. The advanced performance of this algorithm is demonstrated by …
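The masked-loss idea in the abstract, restricting the fusion loss to regions that remain reconstructible after warping, can be illustrated with a short sketch. This is not the paper's implementation: the mask definition, the max-of-inputs intensity target, and the Sobel-based gradient term below are illustrative assumptions chosen only to show how a reconstructible-area mask can gate both an intensity term and a gradient-preservation term.

```python
# Minimal sketch (not the paper's code): a mask-gated fusion loss in the spirit of
# BusReF's strategy of excluding non-reconstructible regions from the loss.
import torch
import torch.nn.functional as F


def sobel_gradient(img: torch.Tensor) -> torch.Tensor:
    """Approximate per-pixel gradient magnitude with Sobel filters."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)


def masked_fusion_loss(fused, ir, vis, mask):
    """Intensity + gradient loss computed only inside the reconstructible mask.

    fused, ir, vis: (B, 1, H, W) tensors; mask: (B, 1, H, W) binary tensor
    marking pixels covered by both modalities after registration (assumed).
    """
    valid = mask.sum().clamp(min=1.0)
    # Intensity term: keep the fused image close to the brighter source pixel.
    intensity_target = torch.maximum(ir, vis)
    loss_int = (mask * (fused - intensity_target).abs()).sum() / valid
    # Gradient term: preserve the stronger of the two source gradients,
    # i.e. the complementary edge information from either modality.
    grad_target = torch.maximum(sobel_gradient(ir), sobel_gradient(vis))
    loss_grad = (mask * (sobel_gradient(fused) - grad_target).abs()).sum() / valid
    return loss_int + loss_grad


if __name__ == "__main__":
    b, h, w = 2, 64, 64
    ir, vis, fused = (torch.rand(b, 1, h, w) for _ in range(3))
    mask = (torch.rand(b, 1, h, w) > 0.2).float()  # toy reconstructible-area mask
    print(masked_fusion_loss(fused, ir, vis, mask).item())
```

The key design point mirrored here is that both loss terms are normalized by the number of valid mask pixels, so regions that cannot be reconstructed (e.g. areas visible to only one camera after warping) contribute nothing to the gradient signal during training.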

