
Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics (2401.04942v1)

Published 10 Jan 2024 in cs.CV

Abstract: In the past several years, road anomaly segmentation has been actively explored in academia and has drawn growing attention in industry. The rationale is straightforward: if an autonomous car can brake before hitting an anomalous object, safety improves. However, this rationale naturally calls for a temporally informed setting, whereas existing methods and benchmarks are designed in an unrealistic frame-wise manner. To bridge this gap, we contribute the first video anomaly segmentation dataset for autonomous driving. Since placing various anomalous objects on busy roads and annotating them in every frame is dangerous and expensive, we resort to synthetic data. To improve the relevance of this synthetic dataset to real-world applications, we train a generative adversarial network conditioned on rendering G-buffers for photorealism enhancement. Our dataset consists of 120,000 high-resolution frames at 60 FPS, recorded in 7 different towns. As an initial benchmark, we provide baselines using the latest supervised and unsupervised road anomaly segmentation methods. Apart from conventional metrics, we focus on two new ones: temporal consistency and latency-aware streaming accuracy. We believe the latter is valuable because it measures whether an anomaly segmentation algorithm can truly prevent a car from crashing in a temporally informed setting.
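
The abstract does not define latency-aware streaming accuracy, so the following is only a rough sketch of the general streaming-evaluation idea, not the paper's metric: each ground-truth frame is scored against the most recent prediction that had already finished computing at that frame's timestamp. All names here (`streaming_match`, `iou`, `streaming_iou`) are hypothetical, masks are represented as sets of pixel coordinates, a frame with no finished prediction is scored against an empty mask, and plain IoU stands in for whatever per-frame score the paper actually uses.

```python
from bisect import bisect_right


def streaming_match(frame_times, finish_times):
    """For each frame timestamp, return the index of the latest prediction
    that finished at or before that time, or None if none has finished yet.
    Assumes finish_times is sorted in ascending order."""
    matches = []
    for t in frame_times:
        k = bisect_right(finish_times, t) - 1
        matches.append(k if k >= 0 else None)
    return matches


def iou(pred, gt):
    """Intersection over union of two masks given as sets of (row, col)."""
    union = len(pred | gt)
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return len(pred & gt) / union


def streaming_iou(gt_masks, pred_masks, frame_times, finish_times):
    """Mean per-frame IoU where each frame is paired with the newest
    prediction available at its timestamp (assumption: an empty mask is
    used while no prediction is available)."""
    matches = streaming_match(frame_times, finish_times)
    scores = []
    for gt, k in zip(gt_masks, matches):
        pred = pred_masks[k] if k is not None else set()
        scores.append(iou(pred, gt))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    fps = 60  # frame rate stated in the abstract
    frame_times = [i / fps for i in range(6)]
    # Hypothetical model: 40 ms per inference, so it only handles frames 0 and 3.
    finish_times = [0.04, 0.09]
    gt_masks = [{(0, 0)}] * 6
    pred_masks = [{(0, 0)}, {(0, 0)}]
    print(streaming_match(frame_times, finish_times))
    print(streaming_iou(gt_masks, pred_masks, frame_times, finish_times))
```

Under this pairing rule, a slower model is penalized even when each individual prediction is correct, because its outputs arrive after the frames they describe.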

Authors (9)
  1. Beiwen Tian (13 papers)
  2. Huan-ang Gao (30 papers)
  3. Leiyao Cui (5 papers)
  4. Yupeng Zheng (18 papers)
  5. Lan Luo (22 papers)
  6. Baofeng Wang (1 paper)
  7. Rong Zhi (3 papers)
  8. Guyue Zhou (68 papers)
  9. Hao Zhao (139 papers)
Citations (4)
