
LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras (2401.16712v2)

Published 30 Jan 2024 in cs.CV, cs.RO, and eess.IV

Abstract: Leveraging rich information is crucial for dense prediction tasks. Light field (LF) cameras are instrumental in this regard, as they allow data to be sampled from various perspectives. This capability provides valuable spatial, depth, and angular information, enhancing scene-parsing tasks. However, we have identified two overlooked issues for the LF salient object detection (SOD) task. (1) Previous approaches predominantly employ a customized two-stream design to discover the spatial and depth features within light field images; due to a lack of intra-network data connectivity, such networks struggle to learn the implicit angular information between different images. (2) Little research has been directed towards data augmentation strategies for LF SOD, and research on inter-network data connectivity is scant. In this study, we propose an efficient paradigm (LF Tracy) to address these issues. It comprises a single-pipeline encoder paired with a highly efficient information aggregation (IA) module (around 8M parameters) to establish intra-network connections. A simple yet effective data augmentation strategy called MixLD is then designed to bridge the inter-network connections. Owing to this paradigm, our model surpasses existing state-of-the-art methods in extensive experiments. Notably, LF Tracy demonstrates a 23% improvement over previous results on the latest large-scale PKU dataset. The source code is publicly available at: https://github.com/FeiBryantkit/LF-Tracy.
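
The abstract names the two mechanisms but gives no implementation details. As a rough illustration only, the following PyTorch sketch shows how a single shared encoder can process every LF view with a lightweight aggregation module, together with a toy mix-based augmentation in the spirit of MixLD. Every name, shape, and design choice here (SinglePipelineSOD, InformationAggregation, mixld, the placeholder backbone, the number of views) is an assumption made for illustration, not the paper's actual code; see the linked repository for the real implementation.

```python
# Hypothetical sketch, NOT the paper's code: one shared encoder for all
# light-field views, plus a small module that fuses per-view features.
import torch
import torch.nn as nn


class InformationAggregation(nn.Module):
    """Toy stand-in for an IA-style module: fuses per-view features."""

    def __init__(self, channels: int, num_views: int):
        super().__init__()
        # A 1x1 conv mixes information across views after concatenation.
        self.fuse = nn.Conv2d(channels * num_views, channels, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, V, C, H, W) -> concatenate views along channels.
        b, v, c, h, w = feats.shape
        return self.fuse(feats.reshape(b, v * c, h, w))


class SinglePipelineSOD(nn.Module):
    """One shared encoder for every view (the 'single pipeline' idea)."""

    def __init__(self, num_views: int = 12, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(  # placeholder backbone, not PVT v2
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.aggregate = InformationAggregation(channels, num_views)
        self.head = nn.Conv2d(channels, 1, kernel_size=1)  # saliency logits

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (B, V, 3, H, W). Folding the view axis into the batch lets
        # one encoder see every view, so angular cues can be learned
        # implicitly instead of through separate spatial/depth streams.
        b, v, c, h, w = views.shape
        feats = self.encoder(views.reshape(b * v, c, h, w))
        feats = feats.reshape(b, v, -1, h, w)
        return self.head(self.aggregate(feats))


def mixld(views: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Toy mix-based LF augmentation in the spirit of MixLD: blend each
    focal slice with the all-in-focus view (assumed to be index 0)."""
    aif = views[:, :1]  # (B, 1, 3, H, W), broadcasts over the view axis
    return alpha * views + (1.0 - alpha) * aif
```

As a usage example, model = SinglePipelineSOD(); out = model(mixld(torch.randn(2, 12, 3, 64, 64))) yields a (2, 1, 64, 64) saliency-logit map. The design point the abstract emphasizes is that folding the view axis into the batch gives a single pipeline access to all views, letting the aggregation step recover angular relationships that separate two-stream designs would miss.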

