SemSegDepth: A Combined Model for Semantic Segmentation and Depth Completion (2209.00381v2)

Published 1 Sep 2022 in cs.CV, cs.AI, and cs.LG

Abstract: Holistic scene understanding is pivotal for the performance of autonomous machines. In this paper we propose a new end-to-end model that performs semantic segmentation and depth completion jointly. The vast majority of recent approaches treat semantic segmentation and depth completion as independent tasks. Our model takes RGB and sparse depth as inputs and produces a dense depth map and the corresponding semantic segmentation image. It consists of a feature extractor, a depth completion branch, a semantic segmentation branch, and a joint branch that further processes semantic and depth information together. Experiments on the Virtual KITTI 2 dataset provide further evidence that combining the two tasks in a multi-task network can effectively improve the performance of each task. Code is available at https://github.com/juanb09111/semantic_depth.
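
The data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name and every computation below is a hypothetical stand-in (simple arithmetic in place of learned layers), and only the wiring of the four components and the input/output contract follow the abstract.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[float, float, float]]]  # H x W, channels in [0, 1]
DepthMap = List[List[float]]                        # H x W, 0.0 = no measurement
LabelMap = List[List[int]]                          # H x W class indices


def extract_features(rgb: RGBImage, sparse_depth: DepthMap):
    """Hypothetical shared feature extractor: (intensity, depth, valid) per pixel."""
    return [
        [(sum(px) / 3.0, d, d > 0.0) for px, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, sparse_depth)
    ]


def depth_branch(features) -> DepthMap:
    """Stand-in depth completion: fill missing pixels with the mean valid depth."""
    valid = [d for row in features for (_, d, ok) in row if ok]
    fill = sum(valid) / len(valid) if valid else 0.0
    return [[d if ok else fill for (_, d, ok) in row] for row in features]


def segmentation_branch(features):
    """Stand-in segmentation: two class scores per pixel derived from intensity."""
    return [[[1.0 - i, i] for (i, _, _) in row] for row in features]


def joint_branch(depth: DepthMap, scores) -> Tuple[DepthMap, LabelMap]:
    """Stand-in joint branch: receives both intermediate outputs, emits final ones.

    A learned model would refine both here; this sketch passes the depth
    through and takes the arg-max class per pixel.
    """
    labels = [[max(range(len(s)), key=s.__getitem__) for s in row] for row in scores]
    return depth, labels


def sem_seg_depth(rgb: RGBImage, sparse_depth: DepthMap) -> Tuple[DepthMap, LabelMap]:
    """RGB + sparse depth in, dense depth + segmentation out."""
    features = extract_features(rgb, sparse_depth)
    coarse_depth = depth_branch(features)
    class_scores = segmentation_branch(features)
    return joint_branch(coarse_depth, class_scores)


rgb = [[(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)],
       [(0.2, 0.2, 0.2), (0.8, 0.8, 0.8)]]
sparse = [[5.0, 0.0],
          [0.0, 7.0]]
dense, labels = sem_seg_depth(rgb, sparse)
```

Both outputs have the same height and width as the inputs, and the dense depth map has no missing pixels as long as the sparse input contains at least one valid measurement.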
