EANet: Enhanced Attribute-based RGBT Tracker Network (2307.01893v1)

Published 4 Jul 2023 in cs.CV

Abstract: Object tracking is a difficult task in computer vision, especially under challenges such as occlusion, illumination change, and motion blur. Recent advances in deep learning have shown promise in handling these conditions. However, most deep-learning-based object trackers use only visible-band (RGB) images. Thermal infrared (TIR) imagery can provide additional information about an object, including its temperature, under such challenging conditions. We propose a deep-learning-based tracking approach that fuses RGB and thermal images (RGBT). The proposed model consists of two main components: a feature extractor and a tracker. The feature extractor encodes deep features from both the RGB and the TIR images, and the tracker then uses these features to track the object with an enhanced attribute-based architecture. We propose fusing attribute-specific feature selection with an aggregation module. The proposed methods are evaluated on the RGBT234 [1] and LasHeR [2] datasets, which are the most widely used RGBT object-tracking datasets in the literature. The results show that the proposed system outperforms state-of-the-art RGBT object trackers on these datasets with a relatively small number of parameters.
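The abstract's pipeline (encode RGB and TIR features, pass them through attribute-specific branches, then combine the branches with an aggregation module) can be sketched at a high level. The following is a minimal NumPy illustration, not the paper's actual modules: the branch gains, the L2-norm quality score, and the softmax-weighted sum are all hypothetical stand-ins for the learned attribute-specific selection and aggregation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate(branch_feats):
    # Score each attribute branch (L2 norm as a stand-in quality score),
    # then return a softmax-weighted sum of the branch features.
    scores = np.array([np.linalg.norm(f) for f in branch_feats])
    weights = softmax(scores)
    fused = np.zeros_like(branch_feats[0])
    for w, f in zip(weights, branch_feats):
        fused += w * f
    return fused, weights

rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal(8)   # stand-in for encoded RGB features
tir_feat = rng.standard_normal(8)   # stand-in for encoded TIR features
shared = np.concatenate([rgb_feat, tir_feat])  # naive RGB-TIR fusion

# Three hypothetical attribute branches (e.g. occlusion, illumination, blur),
# modeled here as simple per-branch gains for illustration only.
branches = [shared * g for g in (0.5, 1.0, 1.5)]
agg, w = aggregate(branches)
```

The point of the weighting step is that branches whose features respond strongly for the current frame contribute more to the aggregated representation, which is the general intuition behind attribute-aware aggregation.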

References (25)
  1. C. Li, X. Liang, Y. Lu, N. Zhao, and J. Tang, “RGB-T object tracking: Benchmark and baseline,” Pattern Recognition 96, 106977 (2019).
  2. C. Li, W. Xue, Y. Jia, Z. Qu, B. Luo, J. Tang, and D. Sun, “LasHeR: A large-scale high-diversity benchmark for RGBT tracking,” IEEE Transactions on Image Processing 31, 392–404 (2022).
  3. Y. Xiao, M. Yang, C. Li, L. Liu, and J. Tang, “Attribute-based progressive fusion network for RGBT tracking,” Proceedings of the AAAI Conference on Artificial Intelligence 36, 2831–2838 (Jun. 2022).
  4. G. Chen, J. Zhang, Y. Liu, J. Yin, X. Yin, L. Cui, and Y. Dai, “ESKNet: An enhanced adaptive selection kernel convolution for breast tumors segmentation,” arXiv eess.IV 2211.02915 (2022).
  5. K. I. Danaci and E. Akagunduz, “A survey on infrared image and video sets,” arXiv cs.CV 2203.08581 (2022).
  6. Z. Tang, T. Xu, and X.-J. Wu, “A survey for deep RGBT tracking,” arXiv cs.CV 2201.09296 (2022).
  7. H. Nam and B. Han, “Learning multi-domain convolutional neural networks for visual tracking,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 4293–4302 (2016).
  8. Y. Zhu, C. Li, J. Tang, and B. Luo, “Quality-aware feature aggregation network for robust RGBT tracking,” IEEE Transactions on Intelligent Vehicles 6(1), 121–130 (2021).
  9. Y. Zhu, C. Li, B. Luo, J. Tang, and X. Wang, “Dense feature aggregation and pruning for RGBT tracking,” in Proceedings of the 27th ACM International Conference on Multimedia, ACM (Oct. 2019).
  10. C. Wang, C. Xu, Z. Cui, L. Zhou, T. Zhang, X. Zhang, and J. Yang, “Cross-modal pattern-propagation for RGB-T tracking,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 7062–7071 (2020).
  11. R. Yang, Y. Zhu, X. Wang, C. Li, and J. Tang, “Learning target-oriented dual attention for robust RGB-T tracking,” in 2019 IEEE International Conference on Image Processing (ICIP), 3975–3979 (2019).
  12. P. Zhang, J. Zhao, C. Bo, D. Wang, H. Lu, and X. Yang, “Jointly modeling motion and appearance cues for robust RGB-T tracking,” IEEE Transactions on Image Processing 30, 3335–3347 (2021).
  13. A. Lu, C. Li, Y. Yan, J. Tang, and B. Luo, “RGBT tracking via multi-adapter network with hierarchical divergence loss,” IEEE Transactions on Image Processing 30, 5613–5625 (2021).
  14. Q. Xu, Y. Mei, J. Liu, and C. Li, “Multimodal cross-layer bilinear pooling for RGBT tracking,” IEEE Transactions on Multimedia 24, 567–580 (2022).
  15. Z. Tu, C. Lin, W. Zhao, C. Li, and J. Tang, “M5l: Multi-modal multi-margin metric learning for RGBT tracking,” IEEE Transactions on Image Processing 31, 85–98 (2022).
  16. H. Zhang, L. Zhang, L. Zhuo, and J. Zhang, “Object tracking in RGB-T videos using modal-aware attention network and competitive learning,” Sensors 20 (Jan. 2020).
  17. Y. Zhu, C. Li, J. Tang, B. Luo, and L. Wang, “RGBT tracking by trident fusion network,” IEEE Transactions on Circuits and Systems for Video Technology 32, 579–592 (Feb. 2022).
  18. C. Li, L. Liu, A. Lu, Q. Ji, and J. Tang, “Challenge-aware RGBT tracking,” in Computer Vision – ECCV 2020 , A. Vedaldi, H. Bischof, T. Brox, and J.-M. Frahm, eds., 222–237, Springer International Publishing, Cham (2020).
  19. P. Zhang, D. Wang, H. Lu, and X. Yang, “Learning adaptive attribute-driven representation for real-time RGB-T tracking,” International Journal of Computer Vision 129, 2714–2729 (Sep. 2021).
  20. X. Li, W. Wang, X. Hu, and J. Yang, “Selective kernel networks,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 510–519, IEEE Computer Society, Los Alamitos, CA, USA (Jun. 2019).
  21. C. Li, X. Wu, N. Zhao, X. Cao, and J. Tang, “Fusing two-stream convolutional neural networks for RGB-T object tracking,” Neurocomputing 281, 78–85 (2018).
  22. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings , Y. Bengio and Y. LeCun, eds. (2015).
  23. X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS), JMLR W&CP 15, 315–323 (2011).
  24. C. Li, H. Cheng, S. Hu, X. Liu, J. Tang, and L. Lin, “Learning collaborative sparse representation for grayscale-thermal tracking,” IEEE Transactions on Image Processing 25, 5743–5756 (2016).
  25. C. L. Li, A. Lu, A. H. Zheng, Z. Tu, and J. Tang, “Multi-adapter RGBT tracking,” in 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW) , 2262–2270 (2019).
Authors (2)
  1. Abbas Türkoğlu (1 paper)
  2. Erdem Akagündüz (20 papers)
Citations (2)