Pedestrian Detection in Thermal Images using Saliency Maps (1904.06859v1)

Published 15 Apr 2019 in cs.CV

Abstract: Thermal images are mainly used to detect the presence of people at night or in bad lighting conditions, but perform poorly at daytime. To solve this problem, most state-of-the-art techniques employ a fusion network that uses features from paired thermal and color images. Instead, we propose to augment thermal images with their saliency maps, to serve as an attention mechanism for the pedestrian detector especially during daytime. We investigate how such an approach results in improved performance for pedestrian detection using only thermal images, eliminating the need for paired color images. For our experiments, we train the Faster R-CNN for pedestrian detection and report the added effect of saliency maps generated using static and deep methods (PiCA-Net and R3-Net). Our best performing model results in an absolute reduction of miss rate by 13.4% and 19.4% over the baseline in day and night images respectively. We also annotate and release pixel level masks of pedestrians on a subset of the KAIST Multispectral Pedestrian Detection dataset, which is a first publicly available dataset for salient pedestrian detection.

Authors (6)

Debasmita Ghose
Shasvat Mukeshkumar Desai
Sneha Bhattacharya
Deep Chakraborty
Madalina Fiterau
Tauhidur Rahman

Summary

Pedestrian detection in complex environments is a critical task in fields such as surveillance and autonomous driving. Recent work has centered on multispectral data, typically fusion architectures that combine features from paired color and thermal images. Relying on paired images is a drawback, however, because capturing such data adds cost and complexity. The paper "Pedestrian Detection in Thermal Images using Saliency Maps" proposes a method that uses thermal images alone, augmented with their saliency maps, to improve pedestrian detection.

The paper addresses an inherent limitation of thermal imaging: during the daytime, ambient temperatures often make people hard to distinguish from their surroundings. To compensate, the authors use saliency maps as an attention mechanism that highlights pedestrians, improving detectors trained on thermal data alone. They train the Faster R-CNN object detector on thermal images augmented with saliency maps generated by both static and deep learning methods.
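One way to picture the augmentation step is as a channel swap. The sketch below assumes the thermal frame is stored as a 3-channel array with identical channels and that the saliency map is a single-channel array of the same height and width. The function name `augment_with_saliency`, the min-max rescaling, and the choice of which channel to overwrite are illustrative assumptions, not details stated in this summary.

```python
import numpy as np


def augment_with_saliency(thermal, saliency, channel=2):
    """Return a copy of `thermal` with one channel replaced by `saliency`.

    thermal  : (H, W, 3) uint8 array, assumed to hold the thermal frame
    saliency : (H, W) numeric array in any range; rescaled to 0..255
    channel  : index of the channel to overwrite (hypothetical default)
    """
    if thermal.ndim != 3 or thermal.shape[2] != 3:
        raise ValueError("thermal must have shape (H, W, 3)")
    if saliency.shape != thermal.shape[:2]:
        raise ValueError("saliency must match the thermal height and width")

    sal = saliency.astype(np.float64)
    span = sal.max() - sal.min()
    # Min-max rescale to 0..255; a constant map becomes all zeros.
    if span > 0:
        sal = (sal - sal.min()) / span * 255.0
    else:
        sal = np.zeros_like(sal)

    out = thermal.copy()  # leave the input frame untouched
    out[..., channel] = np.round(sal).astype(np.uint8)
    return out


# Toy 4x4 frame: uniform background with a "warm" 2x2 region.
gray = np.full((4, 4), 90, dtype=np.uint8)
gray[1:3, 1:3] = 200
thermal = np.stack([gray, gray, gray], axis=-1)

# Stand-in saliency map that fires only on the warm region.
saliency = np.zeros((4, 4))
saliency[1:3, 1:3] = 1.0

augmented = augment_with_saliency(thermal, saliency)
```

The augmented array keeps the thermal intensities in two channels and carries the saliency signal in the third, so a detector that expects 3-channel input can consume it without architectural changes.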

Notably, the research shows that saliency maps substantially reduce the miss rate in pedestrian detection. With thermal images augmented by saliency maps from deep networks such as PiCA-Net and R3-Net, the best-performing model achieves an absolute reduction in miss rate of 13.4% on day images and 19.4% on night images relative to a baseline trained on thermal images alone. The improvement indicates that saliency information complements the raw thermal signal.
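The distinction between an absolute and a relative reduction matters when reading these figures. The sketch below uses the plain definition of miss rate (missed pedestrians divided by all ground-truth pedestrians) with made-up counts. The benchmark's actual evaluation protocol may aggregate miss rates differently, and none of the numbers here come from the paper.

```python
def miss_rate(true_positives, false_negatives):
    """Fraction of ground-truth pedestrians the detector failed to find."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no ground-truth pedestrians to evaluate")
    return false_negatives / total


def absolute_reduction(baseline, improved):
    """Difference between the two rates, in the same unit as the rates."""
    return baseline - improved


def relative_reduction(baseline, improved):
    """Difference expressed as a fraction of the baseline rate."""
    if baseline == 0:
        raise ValueError("baseline miss rate is zero")
    return (baseline - improved) / baseline


# Hypothetical counts over 1000 ground-truth pedestrians.
baseline_mr = miss_rate(true_positives=600, false_negatives=400)
improved_mr = miss_rate(true_positives=750, false_negatives=250)

abs_drop = absolute_reduction(baseline_mr, improved_mr)
rel_drop = relative_reduction(baseline_mr, improved_mr)

print(f"baseline miss rate: {baseline_mr:.1%}")
print(f"improved miss rate: {improved_mr:.1%}")
print(f"absolute reduction: {abs_drop * 100:.1f} percentage points")
print(f"relative reduction: {rel_drop:.1%}")
```

In this toy case the same improvement reads as 15 percentage points in absolute terms but 37.5% in relative terms, which is why the "absolute" qualifier matters in the reported results.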

The paper also releases pixel-level masks of pedestrians for a subset of the KAIST Multispectral Pedestrian Detection dataset, which the authors describe as the first publicly available dataset for salient pedestrian detection.

The results have practical implications for settings where thermal imaging is advantageous or necessary, such as poor lighting conditions. Because the approach needs only thermal input, it points toward more robust and cost-effective pedestrian detection systems that do not require paired multispectral data acquisition.

Future directions in this domain may explore the refinement of saliency detection through joint learning approaches that integrate saliency maps with standard object detection methods in a unified framework. Additionally, expanding the dataset with more detailed semantic annotations could further improve the precision of saliency networks.

Overall, the work offers an effective alternative to multispectral fusion methods and sets the stage for further work on detection systems that rely on a single input modality.
