
Mitigate Target-level Insensitivity of Infrared Small Target Detection via Posterior Distribution Modeling (2403.08380v1)

Published 13 Mar 2024 in cs.CV

Abstract: Infrared Small Target Detection (IRSTD) aims to segment small targets from infrared clutter backgrounds. Existing methods mainly focus on discriminative approaches, i.e., pixel-level foreground-background binary segmentation. Since infrared small targets are small and have a low signal-to-clutter ratio, the empirical risk is barely perturbed when some false alarms and missed detections exist, which seriously limits the further improvement of such methods. Motivated by generative methods for dense prediction, in this paper we propose a diffusion model framework for Infrared Small Target Detection which compensates for pixel-level discrimination with mask posterior distribution modeling. Furthermore, we design Low-frequency Isolation in the wavelet domain to suppress the interference of intrinsic infrared noise on the diffusion noise estimation. This transition from the discriminative paradigm to a generative one enables us to bypass the target-level insensitivity. Experiments show that the proposed method achieves competitive performance gains over state-of-the-art methods on the NUAA-SIRST, IRSTD-1k, and NUDT-SIRST datasets. Code is available at https://github.com/Li-Haoqing/IRSTD-Diff.
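
The target-level insensitivity mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' code or loss function. It assumes a hypothetical 256 x 256 frame containing a single 3 x 3 target, and compares a pixel-level error rate with a target-level recall when the prediction misses the target entirely. The helper names (`make_mask`, `pixel_error`, `target_recall`) and the image and target sizes are illustrative assumptions.

```python
def make_mask(size, targets):
    """Build a size x size binary mask; each target is a (row, col, side) square."""
    mask = [[0] * size for _ in range(size)]
    for r, c, s in targets:
        for i in range(r, r + s):
            for j in range(c, c + s):
                mask[i][j] = 1
    return mask


def pixel_error(pred, truth):
    """Fraction of pixels where prediction and ground truth disagree."""
    total = len(truth) * len(truth[0])
    wrong = sum(p != t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return wrong / total


def target_recall(pred, truth_targets):
    """Fraction of ground-truth targets touched by at least one predicted pixel."""
    hits = 0
    for r, c, s in truth_targets:
        if any(pred[i][j] for i in range(r, r + s) for j in range(c, c + s)):
            hits += 1
    return hits / len(truth_targets)


SIZE = 256                      # assumed frame size (illustrative)
targets = [(100, 120, 3)]       # one hypothetical 3 x 3 target
truth = make_mask(SIZE, targets)
perfect = make_mask(SIZE, targets)  # prediction that matches the ground truth
missed = make_mask(SIZE, [])        # prediction that misses the target entirely

print("pixel error, perfect:", pixel_error(perfect, truth))
print("pixel error, missed: ", pixel_error(missed, truth))
print("target recall, perfect:", target_recall(perfect, targets))
print("target recall, missed: ", target_recall(missed, targets))
```

Under these assumptions, missing the only target changes the pixel-level error by 9 / 65536 (roughly 0.014%), while the target-level recall drops from 1.0 to 0.0. This gap is the kind of insensitivity that the abstract attributes to purely pixel-level discriminative training.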

