
PanDepth: Joint Panoptic Segmentation and Depth Completion (2212.14180v2)

Published 29 Dec 2022 in cs.CV

Abstract: Understanding 3D environments semantically is pivotal in autonomous driving applications where multiple computer vision tasks are involved. Multi-task models provide different types of outputs for a given scene, yielding a more holistic representation while keeping the computational cost low. We propose a multi-task model for panoptic segmentation and depth completion using RGB images and sparse depth maps. Our model successfully predicts fully dense depth maps and performs semantic segmentation, instance segmentation, and panoptic segmentation for every input frame. We conducted extensive experiments on the Virtual KITTI 2 dataset and demonstrate that our model solves multiple tasks without a significant increase in computational cost while maintaining high accuracy. Code is available at https://github.com/juanb09111/PanDepth.git

Authors (2)
  1. Juan Lagos (2 papers)
  2. Esa Rahtu (78 papers)
Citations (1)
