
Learning Hierarchical Color Guidance for Depth Map Super-Resolution (2403.07290v2)

Published 12 Mar 2024 in cs.CV

Abstract: Color information is the most commonly used prior for depth map super-resolution (DSR), as it provides high-frequency boundary guidance for detail restoration. However, its role in DSR has not been fully exploited. In this paper, we rethink the use of color information and propose a hierarchical color guidance network for DSR. On the one hand, a low-level detail embedding module supplements the high-frequency color information of depth features in a residual-mask manner at the low-level stages. On the other hand, a high-level abstract guidance module maintains semantic consistency during reconstruction through a semantic mask that encodes global guidance information. Color information at these two levels thus acts, in a more comprehensive form, at the front and back ends of the attention-based feature projection (AFP) module. Meanwhile, the AFP module integrates a multi-scale content enhancement block and an adaptive attention projection block to make full use of multi-scale information and to adaptively project the information critical for restoration in an attention manner. Compared with state-of-the-art methods on four benchmark datasets, our method achieves more competitive performance both qualitatively and quantitatively.
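
The paper's module definitions are not reproduced on this page, so the following is a minimal PyTorch sketch of the two guidance ideas named in the abstract: a residual-mask injection of high-frequency color detail into depth features, and an attention-style projection gate. Class names, layer choices, and the channel-gate design (a squeeze-and-excitation style block) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LowLevelDetailEmbedding(nn.Module):
    """Illustrative residual-mask color guidance: a mask predicted from
    color features gates high-frequency color detail, which is then added
    residually to the depth features (an assumption, not the paper's code)."""

    def __init__(self, channels: int):
        super().__init__()
        self.mask_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Sigmoid(),  # per-pixel, per-channel mask in [0, 1]
        )
        self.fuse = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, depth_feat, color_feat):
        mask = self.mask_conv(color_feat)                 # where color should guide depth
        return depth_feat + self.fuse(mask * color_feat)  # residual injection


class AdaptiveAttentionProjection(nn.Module):
    """Squeeze-and-excitation style channel gate, standing in for the
    paper's adaptive attention projection block (hypothetical design)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pool to (N, C, 1, 1)
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(self.pool(x))  # adaptively reweight channels


if __name__ == "__main__":
    # Smoke test on random (N, C, H, W) feature maps.
    lde = LowLevelDetailEmbedding(32)
    aap = AdaptiveAttentionProjection(32)
    depth = torch.randn(1, 32, 64, 64)
    color = torch.randn(1, 32, 64, 64)
    out = aap(lde(depth, color))
    print(out.shape)  # torch.Size([1, 32, 64, 64])
```

In this sketch, color guidance enters before the attention projection, loosely mirroring the abstract's description of color information acting at the front and back ends of the AFP module; the actual multi-scale content enhancement block is omitted.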
