- The paper introduces a two-stage UAV-based framework that combines deep learning detection with vertical plane mapping to accurately count windows.
- It uses ShuffleNetV2 for detection and post-processing steps such as template matching and Non-Maximum Suppression to improve detection precision on complex facades.
- Results demonstrate enhanced accuracy in window detection and counting, offering valuable insights for structural monitoring and disaster risk assessment.
Automated Detection and Counting of Windows Using UAV Imagery-Based Remote Sensing
Inspecting and monitoring buildings, including features such as windows, has traditionally been a manual process. The number and layout of windows on a facade have structural implications, particularly during natural calamities such as earthquakes, so there is a clear need for automated, accurate methods. The paper, "Automated Detection and Counting of Windows Using UAV Imagery-Based Remote Sensing," proposes a method that combines UAVs with computer vision algorithms to address this need.
Methodology
The proposed method is a two-stage process that automates the detection and counting of windows on building facades from UAV data. The first stage detects windows with a deep learning network, ShuffleNetV2, followed by a post-processing module that refines the detections. The second stage is a vertical plane mapping algorithm that uses the UAV's orientation and position data to aggregate window counts while eliminating duplicates caused by overlapping image frames.
Data Collection
The dataset consists of images captured with a DJI Tello micro-UAV, which carries onboard sensors including a fixed front-facing camera, a barometer, GPS, and an IMU. The UAV captures sequential images as it ascends vertically, ensuring comprehensive facade coverage. The dataset includes annotated images of buildings on the IIIT Hyderabad campus, which are used for model training and validation.
Window Detection: Deep Learning and Post-Processing
The detection stage employs a heatmap-based technique with ShuffleNetV2 to identify windows on facades. To improve generalization and reduce data bias, the network was trained on a combined dataset comprising 3220 images from the zju facade jcst-2020 dataset and the IIIT-H dataset. The pre-trained model initially performed poorly on the UAV images from IIIT-H, but adding the newly captured UAV data to the training set led to significant accuracy improvements.
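To make the heatmap idea concrete, the sketch below shows one common way to turn a predicted heatmap into candidate locations: keep cells that exceed a confidence threshold and are strict local maxima in their 3x3 neighbourhood. This is an illustration only and is not taken from the paper; the function name `heatmap_peaks`, the threshold value, and the neighbourhood size are all assumptions.

```python
from typing import List, Tuple


def heatmap_peaks(heatmap: List[List[float]],
                  threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for strict local maxima above `threshold`.

    Hypothetical helper: the threshold and the 3x3 neighbourhood are
    assumptions made for illustration, not values from the paper.
    """
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the up-to-8 neighbours, clipping at the borders.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


example = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.7],
    [0.0, 0.1, 0.6, 0.4],
]
peaks = heatmap_peaks(example)
```

In this toy heatmap the cell with score 0.6 is rejected because its neighbour scores 0.7, so only two peaks survive.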
Post-processing applies template matching and Non-Maximum Suppression (NMS) to eliminate redundant detections. The module iterates over the detected windows and enforces a uniform window structure within each storey.
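For readers unfamiliar with NMS, a minimal greedy version can be sketched as follows. It keeps the highest-scoring box and discards any lower-scoring box whose intersection-over-union (IoU) with an already kept box exceeds a threshold. The box format, the function names, and the 0.5 threshold are assumptions for illustration; the paper's exact settings are not given here.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box],
                        scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    The 0.5 threshold is an assumed default, not a value from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 0, 30, 10)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the third box does not overlap and is kept.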
Vertical Plane Mapping and Count Estimation
This stage maps the detected windows onto a vertical plane to consistently count unique windows and estimate storeys. The algorithm accounts for the pitch angle of the UAV during image capture, thereby correcting the vertical coordinates of detected windows. By leveraging the position and IMU data of the UAV, the algorithm projects the window coordinates onto a common 2D vertical plane, ensuring accurate cumulative counts and preventing duplication. The storey count is then derived based on the unique horizontal levels of windows.
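The geometry of this stage can be sketched under some simplifying assumptions that are not stated in the paper: a pinhole camera with known intrinsics (`focal_px`, `cx`, `cy`), a known horizontal distance to the facade, and a camera heading perpendicular to the facade. Under those assumptions, a pixel's viewing ray is rotated by the UAV pitch and intersected with the facade plane. Detections that land within a tolerance of each other are treated as the same window, and storeys are counted as clusters of window heights. All function names, tolerances, and gap values below are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (horizontal position, height) on the facade, metres


def project_to_facade(u: float, v: float, altitude: float, x_uav: float,
                      pitch_rad: float, distance: float,
                      focal_px: float, cx: float, cy: float) -> Point:
    """Project pixel (u, v) onto a vertical plane `distance` metres ahead.

    Assumes a pinhole camera facing the facade; positive pitch tilts it up.
    """
    right = (u - cx) / focal_px   # ray component to the right
    up = (cy - v) / focal_px      # ray component upwards (image v grows down)
    # Rotate the ray (forward = 1, up) by the pitch angle.
    forward_w = math.cos(pitch_rad) - up * math.sin(pitch_rad)
    up_w = math.sin(pitch_rad) + up * math.cos(pitch_rad)
    scale = distance / forward_w  # stretch the ray until it hits the plane
    return (x_uav + scale * right, altitude + scale * up_w)


def merge_duplicates(points: List[Point], tol: float = 0.3) -> List[Point]:
    """Keep one point per window: drop points within `tol` of a kept one."""
    unique: List[Point] = []
    for p in points:
        if all(math.hypot(p[0] - q[0], p[1] - q[1]) > tol for q in unique):
            unique.append(p)
    return unique


def count_storeys(points: List[Point], min_gap: float = 1.5) -> int:
    """Count clusters of window heights separated by more than `min_gap`."""
    heights = sorted(z for _, z in points)
    if not heights:
        return 0
    return 1 + sum(1 for a, b in zip(heights, heights[1:]) if b - a > min_gap)


cam = dict(pitch_rad=0.0, distance=10.0, focal_px=100.0, cx=50.0, cy=50.0)
detections = [
    project_to_facade(50, 40, altitude=1.0, x_uav=0.0, **cam),  # window A, frame 1
    project_to_facade(50, 50, altitude=2.0, x_uav=0.0, **cam),  # window A, frame 2
    project_to_facade(80, 50, altitude=2.0, x_uav=0.0, **cam),  # window B
    project_to_facade(50, 50, altitude=5.0, x_uav=0.0, **cam),  # window C
]
unique_windows = merge_duplicates(detections)
storeys = count_storeys(unique_windows)
```

In this toy example, window A appears in two overlapping frames but maps to the same facade coordinates, so it is counted once: three unique windows across two storeys.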
Results and Discussion
The evaluation shows that the proposed method substantially increases detection accuracy compared to baseline models. Incorporating the IIIT-H dataset into training significantly improved the model’s performance, with gains in both precision and recall across the tested buildings. Post-processing further improved detection, bringing window detection close to complete accuracy across the different sequences, as shown in Table II.
The vertical plane mapping algorithm successfully aggregated the window counts, demonstrating robustness in counting despite overlapping frames. For instance, the Bakul building's facade analysis revealed a window-to-facade area ratio of 16.7%, highlighting the method's capability to derive structural properties essential for risk assessment.
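The window-to-facade area ratio is the total window area divided by the facade area. A minimal sketch of that calculation follows, assuming window sizes and facade dimensions are already available in metres on the mapped plane. The function name and the example dimensions are hypothetical and are not measurements from the paper.

```python
from typing import Sequence, Tuple


def window_to_facade_ratio(window_sizes: Sequence[Tuple[float, float]],
                           facade_width: float,
                           facade_height: float) -> float:
    """Total window area divided by facade area (all lengths in metres)."""
    facade_area = facade_width * facade_height
    if facade_area <= 0:
        raise ValueError("facade dimensions must be positive")
    window_area = sum(w * h for w, h in window_sizes)
    return window_area / facade_area


# Hypothetical facade: four 1.5 m x 1.2 m windows on a 10 m x 6 m wall.
ratio = window_to_facade_ratio([(1.5, 1.2)] * 4, 10.0, 6.0)
print(f"{ratio:.1%}")
```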
Conclusion
This research presents a robust, automated approach to detect and count windows on building facades using UAV-based remote sensing. The method effectively addresses challenges such as overlapping images and varying facade geometries by combining deep learning with geometric correction algorithms. Future work could explore real-time processing capabilities using onboard computational resources, making the system more versatile and immediately deployable for field inspections and disaster management applications.