
xView: Objects in Context in Overhead Imagery (1802.07856v1)

Published 22 Feb 2018 in cs.CV

Abstract: We introduce a new large-scale dataset for the advancement of object detection techniques and overhead object detection research. This satellite imagery dataset enables research progress pertaining to four key computer vision frontiers. We utilize a novel process for geospatial category detection and bounding box annotation with three stages of quality control. Our data is collected from WorldView-3 satellites at 0.3m ground sample distance, providing higher resolution imagery than most public satellite imagery datasets. We compare xView to other object detection datasets in both natural and overhead imagery domains and then provide a baseline analysis using the Single Shot MultiBox Detector. xView is one of the largest and most diverse publicly available object-detection datasets to date, with over 1 million objects across 60 classes in over 1,400 km² of imagery.

Citations (289)

Summary

  • The paper introduces xView, a large-scale dataset that advances object detection in satellite imagery with a rigorous three-stage geospatial annotation process.
  • It features over one million objects labeled across 60 classes at 0.3m resolution, significantly surpassing existing public overhead datasets in size and diversity.
  • Baseline analysis with the SSD architecture highlights challenges in multi-resolution detection and sets the stage for future research in real-world geospatial applications.

xView: Objects in Context in Overhead Imagery

The paper introduces xView, a large-scale dataset intended to advance object detection in overhead satellite imagery. It targets several open challenges in this domain by combining high-resolution imagery with a diverse set of object classes. The dataset was built with a dedicated process for geospatial category detection and bounding box annotation, backed by three stages of quality control.

xView is distinguished by its size and complexity: it exceeds most existing public overhead datasets in resolution and diversity. The imagery was collected by WorldView-3 satellites at a ground sample distance of 0.3 m, a higher resolution than most publicly available satellite imagery datasets offer. The dataset comprises over 1 million labeled objects across 60 classes and covers over 1,400 square kilometers. The paper compares xView with natural-imagery datasets such as COCO and with overhead datasets such as SpaceNet and COWC; overhead datasets in particular often suffer from low class counts and limited geographic diversity.
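To give a sense of what these figures imply, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope calculation, not something taken from the paper: it treats the reported lower bounds (1,400 km² and 1 million objects) as exact values, assumes square 0.3 m pixels, and uses illustrative function names.

```python
def pixels_for_area(area_km2: float, gsd_m: float) -> float:
    """Approximate number of pixels needed to cover an area at a given
    ground sample distance (GSD), assuming square pixels."""
    area_m2 = area_km2 * 1_000_000      # 1 km^2 = 1,000,000 m^2
    return area_m2 / (gsd_m * gsd_m)    # each pixel covers gsd_m x gsd_m metres


def objects_per_km2(num_objects: int, area_km2: float) -> float:
    """Average object density over the whole covered area."""
    return num_objects / area_km2


# Figures quoted in the summary, used here as if they were exact (assumption).
AREA_KM2 = 1_400
GSD_M = 0.3
NUM_OBJECTS = 1_000_000

total_pixels = pixels_for_area(AREA_KM2, GSD_M)
density = objects_per_km2(NUM_OBJECTS, AREA_KM2)

print(f"{total_pixels:.2e} pixels")
print(f"{density:.0f} objects per km^2")
```

Under these assumptions the imagery amounts to roughly 1.6 × 10^10 pixels, with an average of about 714 labeled objects per square kilometer. The real density varies widely from scene to scene.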

Key Research Contributions

  1. Geospatial Annotation Methodology: A three-stage quality control process supports the reliability of xView annotations. Initial worker annotation is followed by supervisory review and then expert review. This rigor matters because object detection in satellite imagery is complicated by variation in viewing angle, lighting, and occlusion across images collected from diverse geographic locations.
  2. Diversity and Scale: xView offers substantially more classes than related overhead datasets. Its 60 classes cover land-use categories and fine-grained vehicle types, which makes the dataset useful for practical applications such as economic reporting and disaster response.
  3. Baseline Algorithmic Analysis: The authors apply the Single Shot MultiBox Detector (SSD) to establish a performance baseline on xView. The results show that the detector struggles with the range of object scales and contexts found in overhead imagery. Notably, the choice of how to handle image scale through multi-resolution chipping significantly affected detection performance.
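The multi-resolution chipping mentioned in the third contribution can be illustrated with a minimal sketch. This is not the authors' code: the chip sizes (300, 400, 500 pixels), the tiling strategy, and all function names are assumptions made for illustration. The idea is that a large scene is cut into square chips at several sizes; if every chip is later resized to the detector's fixed input size, the same object is seen at several apparent scales.

```python
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels


def chip_windows(width: int, height: int, chip_size: int) -> Iterator[Box]:
    """Yield square windows that tile the image. Windows at the right and
    bottom edges are shifted inward, so they may overlap their neighbours."""
    for top in range(0, height, chip_size):
        for left in range(0, width, chip_size):
            x0 = max(0, min(left, width - chip_size))
            y0 = max(0, min(top, height - chip_size))
            yield (x0, y0, min(x0 + chip_size, width), min(y0 + chip_size, height))


def boxes_in_chip(boxes: List[Box], window: Box) -> List[Box]:
    """Clip boxes to the window and convert them to chip-local coordinates.
    Boxes that do not overlap the window are dropped."""
    wx0, wy0, wx1, wy1 = window
    local = []
    for x0, y0, x1, y1 in boxes:
        cx0, cy0 = max(x0, wx0), max(y0, wy0)
        cx1, cy1 = min(x1, wx1), min(y1, wy1)
        if cx1 > cx0 and cy1 > cy0:
            local.append((cx0 - wx0, cy0 - wy0, cx1 - wx0, cy1 - wy0))
    return local


def multi_resolution_chips(
    width: int,
    height: int,
    boxes: List[Box],
    chip_sizes: Tuple[int, ...] = (300, 400, 500),  # assumed sizes
) -> List[Tuple[int, Box, List[Box]]]:
    """Return (chip_size, window, chip-local boxes) for every chip at every size."""
    chips = []
    for size in chip_sizes:
        for window in chip_windows(width, height, size):
            chips.append((size, window, boxes_in_chip(boxes, window)))
    return chips


# Hypothetical 1000 x 600 scene with one annotated object.
chips = multi_resolution_chips(1000, 600, [(290, 10, 320, 40)])
```

A production pipeline would typically also discard boxes that are mostly cut off by a chip boundary and would read pixel data lazily from the source raster; both are omitted here to keep the sketch short.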

Implications and Future Directions

The release of xView is likely to influence both theoretical and applied research in computer vision. It is particularly relevant to small-object detection across massive image datasets. Beyond academia, it provides foundational data for developing practical tools for real-world problems such as disaster management and urban planning.

Research directions opened up by xView include few-shot learning on the imbalanced and changing datasets typical of remote sensing, and domain adaptation to improve model performance across varied geographic regions. Progress in these areas should make detection algorithms more adaptable and efficient across the many contexts captured in overhead imagery.

The xView dataset is a substantial resource for computer vision researchers and for practitioners in the geospatial intelligence community, offering a platform for developing object detection methods and applications. Adding classes and further diversifying the dataset would offer a clear path to extending its impact on AI-driven geospatial analysis.