Exploiting Polarized Material Cues for Robust Car Detection (2401.02606v1)
Abstract: Car detection is a crucial prerequisite for many automated driving functions. Large variations in lighting, weather, and vehicle density make it difficult for existing car detection algorithms to reach the accuracy that safety demands, because color information becomes unstable or limited and impedes the extraction of meaningful, discriminative car features. In this work, we present a novel learning-based car detection method that leverages trichromatic linear polarization as an additional cue to disambiguate such challenging cases. A key observation is that polarization, a characteristic of the light wave, robustly describes intrinsic physical properties of scene objects under various imaging conditions and is strongly linked to the materials of cars (e.g., metal and glass) and of their surroundings (e.g., soil and trees), thereby providing reliable and discriminative features for robust car detection in challenging scenes. To exploit polarization cues, we first construct a pixel-aligned RGB-Polarization car detection dataset, which we then use to train a novel multimodal fusion network. The network dynamically integrates RGB and polarization features in a request-and-complement manner and can explore the intrinsic material properties of cars across all learning samples. We extensively validate our method and demonstrate that it outperforms state-of-the-art detection methods. The experimental results show that polarization is a powerful cue for car detection.
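To make the notion of a trichromatic linear polarization cue concrete, the following is a minimal sketch of how per-channel polarization quantities are commonly derived. It assumes intensities captured behind linear polarizers at 0, 45, 90, and 135 degrees and uses the standard linear Stokes formulation. The abstract does not state how the cues are computed, so the capture setup and the function names here are illustrative assumptions, not the authors' pipeline.

```python
import math


def linear_polarization_cues(i0, i45, i90, i135):
    """Return (s0, dolp, aolp) for one pixel of one color channel.

    Assumption: the inputs are intensities measured behind linear
    polarizers oriented at 0, 45, 90 and 135 degrees.
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal vs. anti-diagonal component
    # Degree of linear polarization: 0 for unpolarized, 1 for fully polarized.
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    # Angle of linear polarization in radians, within (-pi/2, pi/2].
    aolp = 0.5 * math.atan2(s2, s1)
    return s0, dolp, aolp


def trichromatic_cues(pixel):
    """Apply the computation per color channel (hypothetical helper).

    `pixel` maps a channel name ('R', 'G', 'B') to its four polarizer
    intensities (i0, i45, i90, i135).
    """
    return {channel: linear_polarization_cues(*values)
            for channel, values in pixel.items()}
```

Under these assumptions, strongly polarizing surfaces such as glass or metal would tend to yield a higher degree of linear polarization than diffuse surroundings such as soil or foliage, which is the kind of material contrast the abstract describes.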