
SciCap Dataset for Spacecraft Vision

Updated 11 August 2025
  • The SciCap dataset is a large-scale annotated collection designed for spacecraft detection and fine-grained part segmentation in space imagery.
  • It employs a hybrid manual and model-assisted annotation methodology to achieve high-quality, pixel-level segmentation and instance labeling.
  • Benchmarking with state-of-the-art models demonstrates its effectiveness in advancing algorithms for spacecraft localization and parts recognition.

The SciCap dataset is a large-scale, expertly annotated resource for advancing research in spacecraft detection, instance segmentation, and parts recognition with computer vision, particularly in spaceborne scenarios. Its composition, annotation methodology, and benchmarking protocols establish a new standard for space imagery datasets by providing not only spacecraft-level labels but also fine-grained part segmentation masks. The dataset is publicly available at https://github.com/Yurushia1998/SatelliteDataset.

1. Dataset Construction and Annotation Schema

The SciCap dataset contains 3,117 images of satellites and space stations, encompassing both real and synthetic scenes, all standardized to a resolution of 1280×720 pixels. These images contain 3,667 annotated spacecraft instances. Each instance carries a bounding box and a pixel-level segmentation mask, further decomposed at the part level into:

  • Main body
  • Solar panel
  • Antenna

This results in a total of 10,350 part segmentation masks, allowing for detailed object-part recognition. The granularity of the labeling goes well beyond standard satellite datasets, which are typically limited to object detection or coarse segmentation. The size distribution of annotated spacecraft spans from approximately 100 pixels to nearly the full image resolution, capturing both small detail and large structures.
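To make the annotation schema concrete, one instance record might look like the sketch below. The field names, the polygon encoding, and the `SpacecraftAnnotation` class are illustrative assumptions, not the repository's actual on-disk format:

```python
from dataclasses import dataclass, field

# Part classes as described in the dataset (the string labels are assumed).
PART_CLASSES = ("main_body", "solar_panel", "antenna")

@dataclass
class SpacecraftAnnotation:
    """Hypothetical layout for one annotated spacecraft instance."""
    image_id: int
    bbox: tuple              # (x, y, width, height) in pixels
    instance_mask: list      # polygon vertices [(x1, y1), (x2, y2), ...]
    part_masks: dict = field(default_factory=dict)  # part name -> list of polygons

ann = SpacecraftAnnotation(
    image_id=42,
    bbox=(310, 120, 420, 260),
    instance_mask=[(310, 120), (730, 120), (730, 380), (310, 380)],
    part_masks={"solar_panel": [[(310, 160), (420, 160), (420, 340), (310, 340)]]},
)

# Every part label must come from the fixed three-class vocabulary.
assert set(ann.part_masks) <= set(PART_CLASSES)
```

A real loader would likely read such records from JSON or mask images; the point here is only the instance-plus-parts nesting that distinguishes SciCap from detection-only datasets.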

2. Iterative Annotation Methodology

Annotation in SciCap is achieved through a hybrid process combining manual effort and automated model-driven refinement in a bootstrapped iterative cycle:

  • Initial manual annotation: A subset of images is labeled interactively using a Polygon-RNN++-based tool, which enables pixel-accurate polygon editing to decompose each spacecraft into convex regions.
  • Bootstrapped model-assisted labeling: Early manual masks are used to train segmentation models (e.g., DeepLabV3 pre-trained on ImageNet); these are applied to new images, yielding coarse predictions subject to further manual refinement.
  • Redundancy elimination: Agglomerative clustering using color histogram features and a chi-square distance metric is deployed for duplicate detection and removal. For each image $I^{(i)}$, the feature vector $f^{(i)} = [f_1^{(i)}, \dots, f_M^{(i)}]^T$ feeds into:

$$d(I^{(i)}, I^{(j)}) = \frac{1}{2}\sum_{k=1}^{M}\frac{\left(f_k^{(i)} - f_k^{(j)}\right)^2}{f_k^{(i)} + f_k^{(j)}}$$

Images within a distance threshold are grouped and one is retained for annotation.

  • Iterative refinement: This cycle repeats, reducing the manual annotation burden as model predictions improve.

This methodology yields high-quality, richly detailed annotations while amortizing the cost of manual segmentation across the dataset’s scale.
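The redundancy-elimination step can be sketched with SciPy's agglomerative (hierarchical) clustering. The histogram size, linkage method, and threshold below are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def chi2_distance(f_i, f_j, eps=1e-10):
    # d(I_i, I_j) = 1/2 * sum_k (f_ik - f_jk)^2 / (f_ik + f_jk)
    return 0.5 * np.sum((f_i - f_j) ** 2 / (f_i + f_j + eps))

def deduplicate(features, threshold):
    """Cluster images whose histogram features fall within `threshold`
    of each other and keep one representative index per cluster."""
    condensed = pdist(features, metric=chi2_distance)
    labels = fcluster(linkage(condensed, method="average"),
                      t=threshold, criterion="distance")
    keep = {}
    for idx, lab in enumerate(labels):
        keep.setdefault(lab, idx)  # first image in each cluster survives
    return sorted(keep.values())

# Three toy 4-bin color histograms: the first two are near-duplicates.
feats = np.array([[0.50, 0.30, 0.1, 0.1],
                  [0.48, 0.32, 0.1, 0.1],
                  [0.10, 0.10, 0.4, 0.4]])
print(deduplicate(feats, threshold=0.05))  # -> [0, 2]
```

The chi-square distance is small for the two near-identical histograms, so they merge into one cluster and only the first is retained for annotation.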

3. Benchmarking with State-of-the-Art Approaches

The SciCap dataset is accompanied by extensive benchmarks using contemporary object detection and instance segmentation architectures. Notable results include:

  • Object detection:
    • Models: YOLOv3, YOLOv3-spp, YOLOv4-pacsp, EfficientDet
    • Metrics: mAP (mean Average Precision), AP50 (Average Precision at IoU ≥ 0.5)
    • Strong result: EfficientDet-D7 achieves mAP = 0.880, AP50 = 0.904
  • Instance and semantic segmentation:
    • Models: DeepLabV3+ Xception, ASPOCNet, OCRNet, HRNet, ResNeSt-based variants
    • Metrics: pixel accuracy (PixAcc), mean Intersection-over-Union (mIoU), including class-specific and background-excluded mIoU
    • Segmentation by parts: mIoU computed for each of main body, solar panel, and antenna; the solar panel class achieves the highest mIoU, reflecting its distinctness in the images
  • Comparison to Earth-based datasets: The results on SciCap are consistently lower than on Cityscapes and Pascal VOC under equivalent models and settings, underscoring the increased complexity of spaceborne imagery for both detection and segmentation.
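The per-class and background-excluded mIoU metrics above can be computed from a confusion matrix; a minimal NumPy sketch (the toy label maps are invented for illustration):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """IoU per class from flat label maps; class 0 is background."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)  # rows: gt, cols: pred
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    return inter / np.maximum(union, 1)

# Toy flattened label maps with classes
# {0: background, 1: main body, 2: solar panel, 3: antenna}.
gt   = np.array([0, 0, 1, 1, 2, 2, 3, 3])
pred = np.array([0, 1, 1, 1, 2, 2, 3, 0])

iou = per_class_iou(pred, gt, num_classes=4)
print(np.round(iou, 3))       # per-class IoU, background first
print(iou[1:].mean())         # background-excluded mIoU
```

In this toy example the solar-panel class (perfectly predicted) scores IoU 1.0, mirroring the reported pattern where that class dominates thanks to its visual distinctness.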

4. Research and Practical Applications

The SciCap dataset’s annotation density and part-level granularity support multiple critical use cases in the space domain:

  • Spacecraft detection and localization: Foundational for pose estimation, guidance, docking, and active debris removal.
  • Part-level instance segmentation: Facilitates automated monitoring of solar panels, antennas, and structural integrity for spacecraft health assessment and in-orbit servicing.
  • Algorithmic innovation: The dataset enables the development and benchmarking of deep learning approaches explicitly tailored to space imagery, which differs from terrestrial scenes in object scale, background uniformity, and lighting conditions.

By providing both bounding box and structured part-mask data, SciCap encourages research on both coarse and fine-grained recognition under the operational constraints typical in satellite imagery.

5. Data Access, Licensing, and Usage Considerations

The SciCap dataset is publicly available at its GitHub repository (https://github.com/Yurushia1998/SatelliteDataset). While specific licensing terms are not exhaustively detailed in the associated publication, academic and research use under standard citation and attribution norms is expected. Prospective users should consult the project’s repository for up-to-date terms and any potential restrictions.

6. Broader Significance and Limitations

The release of SciCap addresses key bottlenecks in the development of vision-based algorithms for space by creating a benchmark for spacecraft detection and segmentation—an area previously underserved in publicly available data. The bootstrapped annotation protocol exemplifies a scalable strategy for large-scale, pixel-precise segmentation in domains with high annotation cost.

A plausible implication is that as new deep learning architectures are developed using SciCap, researchers may uncover challenges unique to space imagery, such as extreme object scale variance and class imbalance. Additionally, the dataset’s mix of real and synthetic data provides an opportunity to evaluate domain adaptation techniques.

In summary, the SciCap dataset constitutes a foundational resource for research and development in space-oriented computer vision. Its detailed part-level annotations, systematic annotation methodology, and rigorous benchmarking make it a model for future datasets in analogous high-consequence, annotation-scarce domains.