Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

Published 26 Sep 2017 in cs.CV | (1709.09283v2)

Abstract: In recent years, various shadow detection methods from a single image have been proposed and used in vision systems; however, most of them are not appropriate for the robotic applications due to the expensive time complexity. This paper introduces a fast shadow detection method using a deep learning framework, with a time cost that is appropriate for robotic applications. In our solution, we first obtain a shadow prior map with the help of multi-class support vector machine using statistical features. Then, we use a semantic-aware patch-level Convolutional Neural Network that efficiently trains on shadow examples by combining the original image and the shadow prior map. Experiments on benchmark datasets demonstrate the proposed method significantly decreases the time complexity of shadow detection, by one or two orders of magnitude compared with state-of-the-art methods, without losing accuracy.




Summary

Overview of Fast Shadow Detection Using Patched Convolutional Neural Networks

The paper "Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network" presents a method that reduces the computational cost of shadow detection while maintaining accuracy, with robotic applications as the main target. The method combines machine learning with image processing techniques to address a common problem in outdoor images: shadows.

Methodology

The authors employ a two-step process for efficient shadow detection. First, a shadow prior map is generated using a multi-class Support Vector Machine (SVM) trained on statistical features derived from color and texture information. The image is segmented into super-pixels for this step, which avoids the cost of per-pixel processing. Second, the shadow prior map, combined with the original RGB image, serves as input to a patched Convolutional Neural Network (CNN) that performs semantic-aware training on shadow examples.
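The first step can be illustrated with a minimal sketch. It is not the authors' implementation: square grid cells stand in for super-pixels, mean brightness is the only statistical feature, and a fixed threshold replaces the trained SVM. The function names and the threshold value of 0.4 are hypothetical. The sketch only shows the general pattern of scoring each region once and broadcasting that score back to its pixels.

```python
from statistics import mean

def grid_labels(height, width, cell):
    """Assign each pixel to a square cell (a crude stand-in for super-pixels)."""
    cols = (width + cell - 1) // cell
    return [[(y // cell) * cols + (x // cell) for x in range(width)]
            for y in range(height)]

def region_means(gray, labels):
    """Mean intensity per region: one statistical feature per region."""
    buckets = {}
    for value_row, label_row in zip(gray, labels):
        for value, label in zip(value_row, label_row):
            buckets.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in buckets.items()}

def prior_map(gray, labels, threshold=0.4):
    """Broadcast a per-region score back to pixels (1.0 = likely shadow).

    Assumption: a brightness threshold replaces the paper's SVM,
    purely for illustration.
    """
    means = region_means(gray, labels)
    return [[1.0 if means[label] < threshold else 0.0 for label in row]
            for row in labels]

# Toy 4x4 grayscale image: dark left half, bright right half.
gray = [[0.1, 0.2, 0.8, 0.9] for _ in range(4)]
labels = grid_labels(4, 4, cell=2)
prior = prior_map(gray, labels)
```

Because the score is computed once per region, the classification cost scales with the number of regions rather than the number of pixels, which is the efficiency argument the summary describes.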

The patched-CNN is central to both training and inference. It operates on super-pixels rather than individual pixels, which significantly lowers computational cost. The CNN architecture includes six convolutional layers and outputs a shadow probability map for each super-pixel. The result is further refined by retraining on edge pixels between super-pixels to mitigate boundary artifacts.
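How the network input might be assembled can be sketched as follows. This is an assumption-laden illustration, not the paper's code: it assumes the prior map is appended to the RGB image as a fourth channel and that one fixed-size patch is cropped around each region's centroid. The helper names and the patch size of 3 are hypothetical, and the CNN itself is omitted.

```python
def stack_channels(rgb, prior):
    """Append the prior value to each RGB pixel, giving 4 channels per pixel."""
    return [[tuple(pixel) + (p,) for pixel, p in zip(rgb_row, prior_row)]
            for rgb_row, prior_row in zip(rgb, prior)]

def centroid(labels, target):
    """Integer centroid (row, col) of the region labelled `target`."""
    coords = [(y, x) for y, row in enumerate(labels)
              for x, label in enumerate(row) if label == target]
    n = len(coords)
    return (sum(y for y, _ in coords) // n, sum(x for _, x in coords) // n)

def crop_patch(image, center, size):
    """Crop a size x size patch around `center`; border pixels are repeated."""
    height, width = len(image), len(image[0])
    cy, cx = center
    half = size // 2
    return [[image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]
             for x in range(cx - half, cx - half + size)]
            for y in range(cy - half, cy - half + size)]

# Toy inputs: a uniform gray RGB image, a prior map, and four square regions.
rgb = [[(0.5, 0.5, 0.5)] * 4 for _ in range(4)]
prior = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
labels = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]

stacked = stack_channels(rgb, prior)
patch = crop_patch(stacked, centroid(labels, 3), size=3)
```

Under this scheme the network runs once per region instead of once per pixel, so the number of forward passes drops roughly in proportion to the average region size.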

Experimental Evaluation

Using the UCF, UIUC, and SBU benchmark datasets, the authors show that their method is one to two orders of magnitude faster than existing deep learning and statistical shadow detection methods while maintaining comparable accuracy. The reported test-time execution speed on these datasets makes the method well suited to robotics applications, where real-time processing is critical.

Quantitative results show that the shadow accuracy of the proposed method often surpasses that of competing methods such as the Stacked-CNN and Unary-Pairwise approaches. The speed gain therefore does not come at the expense of detection accuracy, which matters for deployment in real-world scenarios.
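For readers unfamiliar with the metric, the sketch below shows one common way to compute per-class accuracy for a binary shadow mask. It assumes that "shadow accuracy" means the fraction of ground-truth shadow pixels that are predicted as shadow, with non-shadow accuracy defined analogously. The summary does not spell out the definition, so this reading and the function name are assumptions.

```python
def per_class_accuracy(pred, truth):
    """Return (shadow_accuracy, non_shadow_accuracy) for binary masks.

    Assumption: 1 marks shadow and 0 marks non-shadow in both masks.
    """
    tp = fn = tn = fp = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if t and p:
                tp += 1      # shadow pixel correctly detected
            elif t:
                fn += 1      # shadow pixel missed
            elif p:
                fp += 1      # non-shadow pixel flagged as shadow
            else:
                tn += 1      # non-shadow pixel correctly rejected
    shadow_acc = tp / (tp + fn) if (tp + fn) else 0.0
    non_shadow_acc = tn / (tn + fp) if (tn + fp) else 0.0
    return shadow_acc, non_shadow_acc

# Toy masks: 5 ground-truth shadow pixels, 3 of them detected.
pred = [[1, 1, 0, 0], [1, 0, 0, 0]]
truth = [[1, 1, 0, 0], [1, 1, 1, 0]]
shadow_acc, non_shadow_acc = per_class_accuracy(pred, truth)
```

Reporting the two classes separately matters because shadows usually cover a small part of an image, so overall pixel accuracy alone can hide poor shadow recall.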

Implications and Speculation

In practice, this method can drastically improve the functionality and efficiency of vision systems utilized in robotics. The reduced computational load facilitates real-time shadow detection, which is crucial for applications like autonomous driving and outdoor visual localization where shadows significantly impair scene understanding. The theoretical implications extend into the broader domain of image processing and computer vision, suggesting that leveraging super-pixel processing alongside deep learning could yield efficient solutions in other challenging tasks.

Future developments in AI may further capitalize on this approach by integrating more sophisticated machine learning models or enhancing features for shadow prediction, potentially leading to more generalized scene interpretation systems. As deep learning frameworks advance and become lighter on computational demands, incorporating such techniques into different layers of robotic vision systems seems promising.

Conclusion

The paper addresses shadow detection from single images with a method that achieves both efficiency and accuracy by integrating SVM-based shadow prior maps with patched CNN processing. Its speed and accuracy make it useful across a range of applications and provide a foundation for further work at the intersection of shadow detection and autonomous robotic systems.