- The paper presents a two-phase deep learning pipeline combining semantic segmentation and Fisher Vector encoding for fine-grained damage assessment from UAV imagery.
- It utilizes pre-trained networks like VGG-M and DeepLab with SVM classifiers to overcome challenges such as noisy, imbalanced data and varying object scales.
- Evaluation on diverse datasets, including transfer learning on Philippines data, demonstrates the method's high precision and robustness for disaster response.
Nazr-CNN: Fine-Grained Classification of UAV Imagery for Damage Assessment
Introduction
Nazr-CNN is presented as a deep learning pipeline specifically tuned for object detection and fine-grained classification of imagery acquired from Unmanned Aerial Vehicles (UAVs) in disaster-struck regions. The pipeline consists of two main components: pixel-level classification for object localization, and Fisher Vector (FV) encoding of features taken from a hidden layer of a Convolutional Neural Network (CNN) to discriminate the texture of different damage levels. This is particularly useful in post-disaster scenarios where rapid damage assessment is critical for effective response and resource allocation.
Figure 1: UAV image acquisition and annotation workflow implemented in MicroMappers for this paper, following the approach used for text classification by AIDR.
Problem Definition and Constraints
The problem addressed is the automatic classification of UAV-acquired aerial images into damage categories. The constraints involve handling heterogeneous backgrounds and variable object scales caused by varying UAV altitudes, designing classifiers robust to noisy and limited data, and overcoming annotation discrepancies. The goal is precise object detection in which each detected structure is classified into one of four categories: Mild, Medium, Severe, or Background.
Deep Learning Framework
Pre-Trained Networks
The use of pre-trained networks such as VGG-M is emphasized: their weights serve as the initialization for feature extraction because only limited annotated data is available. This keeps the approach computationally efficient while leveraging the strengths of established networks for aerial imagery.
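As a rough illustration of this feature-extraction step, the sketch below loads an ImageNet-pretrained backbone and keeps only its convolutional trunk as a dense local-descriptor extractor. VGG-M is not available in torchvision, so VGG-16 is used as a stand-in; the backbone, layer choice, and input size are assumptions rather than the paper's exact configuration.

```python
# Sketch: extracting mid-level features from a pre-trained CNN.
# VGG-M is not distributed with torchvision, so VGG-16 is used here as a
# stand-in; the paper's exact backbone and layer choice may differ.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Load ImageNet-pretrained weights instead of training from scratch,
# since the annotated UAV dataset is small.
backbone = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
backbone.eval()

# Keep only the convolutional trunk; its activations serve as dense
# local descriptors for the later Fisher Vector encoding.
feature_extractor = backbone.features

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_features(image_path: str) -> torch.Tensor:
    """Return a (num_locations, channels) matrix of local CNN descriptors."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = feature_extractor(img)          # (1, C, H', W')
    return fmap.squeeze(0).flatten(1).T        # (H'*W', C)
```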
Semantic Segmentation
The first phase of the pipeline uses DeepLab, which integrates CNNs with Conditional Random Fields (CRFs) to enhance segmentation accuracy by refining boundaries and suppressing noise. DeepLab's fully connected CRF recovers local shape rather than merely smoothing the image.
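A minimal sketch of such CRF refinement, in the spirit of DeepLab's post-processing, is given below using the pydensecrf package. The kernel parameters and number of inference iterations are illustrative defaults, not the paper's settings.

```python
# Sketch: refining CNN softmax outputs with a fully connected CRF, in the
# spirit of DeepLab's post-processing step. Parameters are illustrative.
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def refine_with_crf(image: np.ndarray, probs: np.ndarray, n_iters: int = 5) -> np.ndarray:
    """image: (H, W, 3) uint8; probs: (n_labels, H, W) CNN softmax scores."""
    n_labels, H, W = probs.shape
    d = dcrf.DenseCRF2D(W, H, n_labels)

    # Unary term: negative log of the CNN's per-pixel class probabilities.
    d.setUnaryEnergy(unary_from_softmax(probs))

    # Pairwise terms: a smoothness kernel on position and an appearance
    # kernel on position + colour, which snaps labels to object boundaries.
    d.addPairwiseGaussian(sxy=3, compat=3)
    d.addPairwiseBilateral(sxy=80, srgb=13,
                           rgbim=np.ascontiguousarray(image), compat=10)

    Q = d.inference(n_iters)
    return np.argmax(Q, axis=0).reshape(H, W)  # refined per-pixel labels
```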
Fisher Vector Encoding
Fisher Vectors extend the traditional Bag-of-Visual-Words model by fitting a Gaussian Mixture Model to CNN-extracted features and encoding each region by its gradients with respect to the model's parameters, allowing detailed texture-based classification. This encoding is what discriminates between the various damage levels, which is pivotal for effective disaster response.
Figure 2: Bag of Visual Words (BoV) vs. Fisher Vector (FV) representation.
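The sketch below shows one common way to compute such an encoding: fit a diagonal-covariance GMM on pooled CNN descriptors, then encode each segment by the gradients with respect to the GMM means and standard deviations, followed by power and L2 normalization (the improved Fisher Vector). The number of mixture components and the normalization details are assumptions, not the paper's exact settings.

```python
# Sketch: Fisher Vector encoding of local CNN descriptors with a diagonal
# GMM (standard improved-FV formulation; component count and normalisation
# are assumptions, not the paper's exact settings).
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm(descriptors: np.ndarray, k: int = 64) -> GaussianMixture:
    """Fit the visual vocabulary on descriptors pooled from training images."""
    return GaussianMixture(n_components=k, covariance_type="diag").fit(descriptors)

def fisher_vector(x: np.ndarray, gmm: GaussianMixture) -> np.ndarray:
    """x: (N, D) local descriptors from one image segment."""
    N = x.shape[0]
    q = gmm.predict_proba(x)                      # (N, K) soft assignments
    mu, sigma, w = gmm.means_, np.sqrt(gmm.covariances_), gmm.weights_

    # Gradients w.r.t. the GMM means and standard deviations.
    diff = (x[:, None, :] - mu[None, :, :]) / sigma[None, :, :]   # (N, K, D)
    g_mu = np.einsum("nk,nkd->kd", q, diff) / (N * np.sqrt(w)[:, None])
    g_sigma = np.einsum("nk,nkd->kd", q, diff ** 2 - 1.0) / (N * np.sqrt(2 * w)[:, None])

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    # Power and L2 normalisation, as in the improved Fisher Vector.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)
```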
Proposed Pipeline
The proposed pipeline is a two-phase process: semantic segmentation for localization followed by FV-encoded CNN processing for classification, with SVM used as a classifier for the extracted segments. Integrating both ensures geometry and texture are leveraged for effective damage assessment.
Figure 3: Nazr-CNN combines pixel-level classification with FV-CNN. A multi-class SVM is then trained on the Fisher Vectors.
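A minimal sketch of the final classification step follows, assuming the earlier feature-extraction and Fisher Vector sketches: each segmentation-derived crop is encoded as a Fisher Vector, and a linear multi-class SVM is trained on the resulting descriptors. The helper names and hyperparameters are illustrative, not the paper's exact choices.

```python
# Sketch: training the final multi-class SVM on Fisher Vectors extracted from
# segmentation-derived crops. `fisher_vector` refers to the earlier sketch;
# the inputs and hyperparameters are illustrative.
import numpy as np
from sklearn.svm import LinearSVC

def train_damage_classifier(train_crops, train_labels, gmm):
    """train_crops: list of (N_i, D) local-descriptor matrices, one per crop;
    train_labels: the corresponding damage classes (e.g. Mild/Medium/Severe)."""
    X = np.stack([fisher_vector(crop_descr, gmm) for crop_descr in train_crops])
    y = np.asarray(train_labels)
    clf = LinearSVC(C=1.0)                 # one-vs-rest multi-class linear SVM
    return clf.fit(X, y)
```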
Experiments and Evaluation
Semantic Segmentation Results
DeepLab achieves strong localization outcomes, but class weighting is required to obtain good accuracy across the imbalanced damage categories; weighting the loss by class frequency significantly improves both object detection and damage categorization.
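One common way to implement such class weighting, sketched below under the assumption of a pixel-wise cross-entropy segmentation loss, is to weight each class by the inverse of its pixel frequency so that the dominant Background class does not swamp the rarer damage classes. The counts and weighting formula are illustrative, not the paper's exact scheme.

```python
# Sketch: inverse-frequency class weights for the segmentation loss, to keep
# the frequent Background class from dominating the rare damage classes.
# The exact weighting scheme in the paper may differ; this is illustrative.
import numpy as np
import torch
import torch.nn as nn

def inverse_frequency_weights(pixel_counts: dict) -> torch.Tensor:
    """pixel_counts maps each class to its number of training pixels."""
    counts = np.array(list(pixel_counts.values()), dtype=np.float64)
    weights = counts.sum() / (len(counts) * counts)   # rare classes get large weights
    return torch.tensor(weights, dtype=torch.float32)

# Hypothetical counts: Background dominates the damage classes.
w = inverse_frequency_weights({"Background": 9_000_000, "Mild": 400_000,
                               "Medium": 250_000, "Severe": 150_000})
criterion = nn.CrossEntropyLoss(weight=w)   # weighted pixel-wise loss
```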
Integrated Pipeline Effectiveness
Nazr-CNN's integration of semantic segmentation with FV-CNN exceeded baseline results, demonstrating robustness against labeling errors and the altitude variability inherent in UAV imagery. Precision-recall metrics confirm effective fine-grained classification across the damage classes.
Figure 4: Damage-class distribution across the 4,253 bounding boxes in the 1,085-image dataset.
Transfer Learning and Generalization
Nazr-CNN also showed promising results when applied to a new, heterogeneous dataset from the Philippines, demonstrating that it generalizes beyond its original training environment. This suggests transferability to diverse disaster contexts, although fine-tuning for different textures and landscapes remains challenging.

Figure 5: Transfer learning on Philippines data.
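A minimal sketch of this kind of adaptation, assuming a PyTorch backbone, is to freeze the early convolutional layers (which capture generic low-level structure) and fine-tune only the later layers on the new region's data. The split point and optimizer settings below are assumptions for illustration, not the paper's configuration.

```python
# Sketch: adapting the pipeline to a new region by freezing early convolutional
# layers and fine-tuning only the later ones on the new data. The split point
# and optimiser settings are assumptions for illustration, not the paper's.
import torch
import torchvision.models as models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)

# Freeze the early layers, which capture generic low-level structure ...
for layer in list(model.features)[:17]:
    for p in layer.parameters():
        p.requires_grad = False

# ... and fine-tune the remaining layers on the new, smaller dataset.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-4, momentum=0.9)
```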
Conclusion
Nazr-CNN emerges as a robust system for UAV imagery damage assessment, leveraging deep learning's potential to address challenges in disaster management environments. Future research directions include refining end-to-end segmentation techniques to further mitigate annotation noise and ambiguity, enhancing system performance and adaptability.