Revisiting Modality Imbalance In Multimodal Pedestrian Detection
Abstract: Multimodal learning, particularly for pedestrian detection, has recently received attention because it can perform equally well across several critical autonomous driving scenarios, such as low-light, night-time, and adverse weather conditions. In most cases, however, the training distribution emphasizes the contribution of one specific input, which biases the network towards that modality. Generalization of such models therefore becomes a significant problem when the input modality that was non-dominant during training contributes more at inference time. Here, we introduce a novel training setup with a regularizer in the multimodal architecture to resolve this disparity between the modalities. Specifically, our regularizer term makes the feature fusion method more robust by treating both feature extractors as equally important during training when extracting the multimodal distribution; we refer to this as removing the imbalance problem. Furthermore, our concept of decoupling the output stream helps the detection task by mutually sharing spatially sensitive information. Extensive experiments with the proposed method on the KAIST and UTokyo datasets show improvements over the respective state-of-the-art performance.