
The revenge of BiSeNet: Efficient Multi-Task Image Segmentation (2404.09570v1)

Published 15 Apr 2024 in cs.CV

Abstract: Recent advancements in image segmentation have focused on enhancing the efficiency of models to meet the demands of real-time applications, especially on edge devices. However, existing research has primarily concentrated on single-task settings, particularly semantic segmentation, leading to redundant efforts and specialized architectures for different tasks. To address this limitation, we propose a novel architecture for efficient multi-task image segmentation, capable of handling various segmentation tasks without sacrificing efficiency or accuracy. We introduce BiSeNetFormer, which leverages the efficiency of two-stream semantic segmentation architectures and extends them into a mask classification framework. Our approach maintains the efficient spatial and context paths to capture detailed and semantic information, respectively, while leveraging an efficient transformer-based segmentation head that computes the binary masks and class probabilities. By seamlessly supporting multiple tasks, namely semantic and panoptic segmentation, BiSeNetFormer offers a versatile solution for multi-task segmentation. We evaluate our approach on two popular datasets, Cityscapes and ADE20K, demonstrating impressive inference speeds while maintaining competitive accuracy compared to state-of-the-art architectures. Our results indicate that BiSeNetFormer represents a significant advancement towards fast, efficient, multi-task segmentation networks, bridging the gap between model efficiency and task adaptability.
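
To make the phrase "mask classification framework" concrete, the sketch below shows one common way a set of binary masks and class probabilities can be merged into a semantic map: each pixel takes the class with the highest sum, over queries, of class probability times mask probability. This is the generic MaskFormer-style semantic inference rule, not necessarily the exact procedure used by BiSeNetFormer. The function name `semantic_map`, the input layout, and the trailing "no-object" class are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def sigmoid(value):
    """Logistic function, mapping a mask logit to a probability."""
    return 1.0 / (1.0 + math.exp(-value))


def semantic_map(class_logits, mask_logits):
    """Merge per-query predictions into a per-pixel class map.

    class_logits: N lists of K+1 logits (assumed: last entry is "no object").
    mask_logits:  N grids of H x W mask logits, one grid per query.
    Returns an H x W grid of class indices in [0, K).
    """
    num_classes = len(class_logits[0]) - 1
    # Drop the "no object" probability; it does not vote for any class.
    probs = [softmax(query)[:num_classes] for query in class_logits]
    height, width = len(mask_logits[0]), len(mask_logits[0][0])

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c at this pixel: sum_q p_q(c) * m_q(y, x).
            scores = [
                sum(probs[q][c] * sigmoid(mask_logits[q][y][x])
                    for q in range(len(probs)))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        result.append(row)
    return result


# Toy input: two queries, two classes plus "no object", a 1 x 2 image.
toy_classes = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
toy_masks = [[[4.0, -4.0]], [[-4.0, 4.0]]]
print(semantic_map(toy_classes, toy_masks))
```

In the toy input, the first query is confident in class 0 and covers the left pixel, while the second is confident in class 1 and covers the right pixel. Panoptic inference in this family of models typically assigns each pixel to a single query instead of summing over queries, which is what lets one head serve both tasks.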
