Invisible Gas Detection: An RGB-Thermal Cross Attention Network and A New Benchmark (2403.17712v2)
Abstract: Many chemical gases used in industrial processes are highly toxic, so effective measures are needed to prevent their leakage during transportation and storage. Thermal infrared-based computer vision techniques offer a straightforward way to identify gas leakage areas. However, developing high-quality algorithms has been challenging because thermal images contain little texture and open-source datasets are lacking. In this paper, we present the RGB-Thermal Cross Attention Network (RT-CAN), which employs an RGB-assisted two-stream network architecture to integrate texture information from RGB images with gas area information from thermal images. Additionally, to facilitate research on invisible gas detection, we introduce Gas-DB, an extensive open-source gas detection database containing about 1.3K well-annotated RGB-thermal images collected in eight different scenes. Experimental results demonstrate that our method successfully leverages the advantages of both modalities: it achieves state-of-the-art (SOTA) performance among RGB-thermal methods and surpasses single-stream SOTA models in accuracy, Intersection over Union (IoU), and F2 by 4.86%, 5.65%, and 4.88%, respectively. The code and data can be found at https://github.com/logic112358/RT-CAN.
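The abstract reports accuracy, IoU, and F2 but does not define them. The following is a minimal sketch of how these three quantities are commonly computed for a binary gas / background mask. It is not taken from the RT-CAN code: the function names `confusion_counts` and `segmentation_metrics` are hypothetical, masks are assumed to be nested lists of 0/1 values, and "accuracy" is assumed here to mean plain pixel accuracy, which may differ from the definition the paper uses.

```python
def confusion_counts(pred, target):
    """Count (TP, FP, FN, TN) over two equally sized binary masks.

    Both masks are nested lists of 0/1 values (an assumption of this sketch);
    1 marks a gas pixel and 0 marks background.
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # gas predicted, gas present
            elif p:
                fp += 1      # gas predicted, background present
            elif t:
                fn += 1      # gas missed
            else:
                tn += 1      # background correctly rejected
    return tp, fp, fn, tn


def segmentation_metrics(pred, target, beta=2.0):
    """Return pixel accuracy, IoU of the gas class, and the F-beta score.

    With beta=2 the F-score weights recall more heavily than precision,
    so missed gas pixels cost more than false alarms.
    """
    tp, fp, fn, tn = confusion_counts(pred, target)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    union = tp + fp + fn
    iou = tp / union if union else 0.0

    # F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    f_beta = (1 + b2) * tp / denom if denom else 0.0

    return {"accuracy": accuracy, "iou": iou, "f_beta": f_beta}


if __name__ == "__main__":
    # Toy 2x3 masks, invented for illustration only.
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    print(segmentation_metrics(pred, target))
```

Choosing beta = 2 reflects a safety-oriented trade-off: in leak detection a missed gas region is usually costlier than a false alarm, so recall is given more weight than precision.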