HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation (2403.16788v1)

Published 25 Mar 2024 in cs.CV

Abstract: Event-based semantic segmentation has gained popularity due to its capability to deal with scenarios under high-speed motion and extreme lighting conditions, which cannot be addressed by conventional RGB cameras. Since it is hard to annotate event data, previous approaches rely on event-to-image reconstruction to obtain pseudo labels for training. However, this will inevitably introduce noise, and learning from noisy pseudo labels, especially when generated from a single source, may reinforce the errors. This drawback is also called confirmation bias in pseudo-labeling. In this paper, we propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels. In particular, we first employ a plain unsupervised domain adaptation framework as our baseline, which can generate a set of pseudo labels through self-training. Then, we incorporate offline event-to-image reconstruction into the framework, and obtain another set of pseudo labels by predicting segmentation maps on the reconstructed images. A noisy label learning strategy is designed to mix the two sets of pseudo labels and enhance the quality. Moreover, we propose a soft prototypical alignment module to further improve the consistency of target domain features. Extensive experiments show that our proposed method outperforms existing state-of-the-art methods by a large margin on the DSEC-Semantic dataset (+5.88% accuracy, +10.32% mIoU), which even surpasses several supervised methods.
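As a rough illustration of what mixing two sets of pseudo labels can look like, the sketch below combines two per-pixel label maps with a simple agreement-and-confidence rule. It is not the noisy label learning strategy of HPL-ESS, which the abstract does not specify; the function name `mix_pseudo_labels`, the `threshold` parameter, and the `IGNORE_INDEX` value are all assumptions made for this example.

```python
from typing import List

# Hypothetical "unlabeled" marker; 255 is a common convention in segmentation
# code, not something stated in the abstract.
IGNORE_INDEX = 255


def mix_pseudo_labels(
    labels_a: List[int],
    conf_a: List[float],
    labels_b: List[int],
    conf_b: List[float],
    threshold: float = 0.9,
) -> List[int]:
    """Merge two flattened pseudo-label maps pixel by pixel.

    labels_a / conf_a: labels and confidences from one source
        (e.g. self-training on events).
    labels_b / conf_b: labels and confidences from a second source
        (e.g. predictions on reconstructed images).
    """
    if not (len(labels_a) == len(conf_a) == len(labels_b) == len(conf_b)):
        raise ValueError("all inputs must have the same number of pixels")

    mixed = []
    for la, ca, lb, cb in zip(labels_a, conf_a, labels_b, conf_b):
        if la == lb:
            # Both sources agree: keep the shared label.
            mixed.append(la)
        elif ca >= threshold and cb < threshold:
            # Only source A is confident: trust A.
            mixed.append(la)
        elif cb >= threshold and ca < threshold:
            # Only source B is confident: trust B.
            mixed.append(lb)
        else:
            # Unresolved disagreement: exclude the pixel from the loss.
            mixed.append(IGNORE_INDEX)
    return mixed


labels = mix_pseudo_labels(
    labels_a=[0, 1, 2, 3], conf_a=[0.5, 0.95, 0.4, 0.95],
    labels_b=[0, 2, 1, 4], conf_b=[0.6, 0.3, 0.92, 0.97],
)
print(labels)
```

Pixels where the two sources disagree and neither is clearly more reliable are marked as ignored, so a single noisy source cannot reinforce its own errors on them. This is one simple way to counter the confirmation bias the abstract describes.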

Authors (10)
  1. Linglin Jing (5 papers)
  2. Yiming Ding (20 papers)
  3. Yunpeng Gao (4 papers)
  4. Zhigang Wang (107 papers)
  5. Xu Yan (130 papers)
  6. Dong Wang (628 papers)
  7. Gerald Schaefer (16 papers)
  8. Hui Fang (48 papers)
  9. Bin Zhao (107 papers)
  10. Xuelong Li (268 papers)
