CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection (2210.02843v1)

Published 6 Oct 2022 in cs.CV

Abstract: To address how cross-modality information can be effectively captured and utilized in the RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on novel cross-modality interaction and refinement. For the cross-modality interaction, 1) a progressive attention guided integration unit is proposed to sufficiently integrate RGB-D feature representations in the encoder stage, and 2) a convergence aggregation structure is proposed, which flows the RGB and depth decoding features into the corresponding RGB-D decoding streams via an importance gated fusion unit in the decoder stage. For the cross-modality refinement, we insert a refinement middleware structure between the encoder and the decoder, in which the RGB, depth, and RGB-D encoder features are further refined by successively applying a self-modality attention refinement unit and a cross-modality weighting refinement unit. Finally, with the gradually refined features, we predict the saliency map in the decoder stage. Extensive experiments on six popular RGB-D SOD benchmarks demonstrate that our network outperforms state-of-the-art saliency detectors both qualitatively and quantitatively.
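
The abstract names an importance gated fusion unit but does not give its formula. As a rough illustration only, the sketch below shows a generic element-wise gated fusion of two feature vectors: a sigmoid gate weights the RGB feature and its complement weights the depth feature. The function names, the flat-list feature representation, and the externally supplied gate logits are all assumptions made for illustration. This is not the paper's actual unit, which operates on CNN feature maps with learned parameters.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(rgb: List[float], depth: List[float],
                 gate_logits: List[float]) -> List[float]:
    """Generic gated fusion (hypothetical helper, not the CIR-Net unit).

    For each position, g = sigmoid(logit) and the output is
    g * rgb + (1 - g) * depth. In a real network the logits would be
    predicted from the features; here they are passed in directly.
    """
    if not (len(rgb) == len(depth) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for r, d, z in zip(rgb, depth, gate_logits):
        g = sigmoid(z)  # importance weight for the RGB feature
        fused.append(g * r + (1.0 - g) * d)
    return fused


if __name__ == "__main__":
    # Zero logits give equal weight to both modalities.
    print(gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0]))
```

With zero logits the gate is 0.5 everywhere, so the output is the plain average of the two inputs. Large positive logits push the result toward the RGB feature, and large negative logits push it toward the depth feature.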

Authors (7)
  1. Runmin Cong (59 papers)
  2. Qinwei Lin (7 papers)
  3. Chen Zhang (403 papers)
  4. Chongyi Li (88 papers)
  5. Xiaochun Cao (177 papers)
  6. Qingming Huang (168 papers)
  7. Yao Zhao (272 papers)
Citations (100)
