Beluga Whale Detection from Satellite Imagery with Point Labels (2505.12066v1)

Published 17 May 2025 in cs.CV

Abstract: Very high-resolution (VHR) satellite imagery has emerged as a powerful tool for monitoring marine animals on a large scale. However, existing deep learning-based whale detection methods usually require manually created, high-quality bounding box annotations, which are labor-intensive to produce. Moreover, existing studies often exclude "uncertain whales", individuals that have ambiguous appearances in satellite imagery, limiting the applicability of these models in real-world scenarios. To address these limitations, this study introduces an automated pipeline for detecting beluga whales and harp seals in VHR satellite imagery. The pipeline leverages point annotations and the Segment Anything Model (SAM) to generate precise bounding box annotations, which are used to train YOLOv8 for multiclass detection of certain whales, uncertain whales, and harp seals. Experimental results demonstrated that SAM-generated annotations significantly improved detection performance, achieving higher $\text{F}_\text{1}$-scores compared to traditional buffer-based annotations. YOLOv8 trained on SAM-labeled boxes achieved an $\text{F}_\text{1}$-score of 72.2% for whales overall and 70.3% for harp seals, with superior performance in dense scenes. The proposed approach not only reduces the manual effort required for annotation but also enhances the detection of uncertain whales, offering a more comprehensive solution for marine animal monitoring. This method holds great potential for extending to other species, habitats, and remote sensing platforms, as well as for estimating whale biometrics, thereby advancing ecological monitoring and conservation efforts. The code for our label and detection pipeline is publicly available at http://github.com/voyagerxvoyagerx/beluga-seeker.

Summary

Beluga Whale Detection from Satellite Imagery with Point Labels

The paper presents a novel approach to detecting beluga whales from very high-resolution (VHR) satellite imagery using point labels. The method leverages recent advancements in deep learning to automate and improve the annotation process, allowing efficient detection of marine animals such as beluga whales and harp seals. The primary objective is to reduce the manual labor involved in creating high-quality bounding box annotations, a significant challenge in existing whale detection methodologies.

Methodology

The paper introduces an automated pipeline that uses the Segment Anything Model (SAM) to generate bounding box annotations from point labels. SAM, trained on a billion-scale dataset, produces a mask for each whale from panchromatic satellite imagery. The bounding boxes derived from these masks are then used to train a YOLOv8 model for multiclass detection. The pipeline distinguishes between "certain" and "uncertain" whales based on visible features and spatial context, an improvement over models that exclude whales with ambiguous appearances.
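To make the difference between mask-derived and buffer-based boxes concrete, here is a minimal sketch of both. It is not the authors' code: the function names, the buffer half-size, and the class id are hypothetical, and the mask is assumed to be a boolean array such as a segmentation model might return. The last helper converts a pixel box into the normalized `class x_center y_center width height` layout that YOLO-style label files commonly use.

```python
import numpy as np

def mask_to_box(mask):
    """Tight (x_min, y_min, x_max, y_max) box around the True pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    # +1 makes the max edge exclusive, so width = x_max - x_min
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1

def buffer_box(point_xy, half_size, image_shape):
    """Fixed-size square around a point label, clipped to the image (baseline style)."""
    x, y = point_xy
    h, w = image_shape
    return (max(0, x - half_size), max(0, y - half_size),
            min(w, x + half_size + 1), min(h, y + half_size + 1))

def to_yolo(box, image_shape, class_id):
    """Convert a pixel box to (class, x_center, y_center, width, height), normalized to [0, 1]."""
    h, w = image_shape
    x0, y0, x1, y1 = box
    return (class_id,
            (x0 + x1) / 2 / w, (y0 + y1) / 2 / h,
            (x1 - x0) / w, (y1 - y0) / h)

# Hypothetical 10x10 patch with one elongated animal mask
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:9] = True
print(mask_to_box(mask))                   # box follows the animal's shape
print(buffer_box((6, 3), 2, mask.shape))   # box has a fixed size regardless of shape
```

The mask-derived box adapts to each animal's extent, whereas the buffer box is the same size for every point, which is one plausible reason shape-aware labels could help the detector.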

The methodology includes image preprocessing, where imagery is converted from 16-bit to 8-bit and cropped into patches for compatibility with YOLOv8. During the annotation, SAM-generated masks are refined by an algorithm that assigns overlapping pixels based on proximity to the annotation points to create distinct bounding boxes for closely packed whales.
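The preprocessing and overlap-resolution steps described above might look roughly like the following. This is a sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the percentile stretch used for the 16-bit to 8-bit conversion is an assumed choice, the tiling is non-overlapping, and "proximity" is taken to mean Euclidean distance to each annotation point.

```python
import numpy as np

def to_8bit(image16, low_pct=2.0, high_pct=98.0):
    """Rescale a 16-bit image to uint8 with a percentile stretch (assumed scheme)."""
    lo, hi = np.percentile(image16, [low_pct, high_pct])
    if hi <= lo:
        return np.zeros(image16.shape, dtype=np.uint8)
    scaled = (image16.astype(np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).round().astype(np.uint8)

def iter_patches(image, patch_size):
    """Yield (top, left, patch) tiles; patches at the right/bottom edge may be smaller."""
    h, w = image.shape[:2]
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            yield top, left, image[top:top + patch_size, left:left + patch_size]

def resolve_overlaps(masks, points_xy):
    """Assign each masked pixel to the nearest annotation point among the masks covering it.

    masks: (N, H, W) boolean array, one mask per point label.
    points_xy: (N, 2) array of (x, y) point labels, in the same order as masks.
    Returns an (H, W) integer label map, with -1 for background.
    """
    masks = np.asarray(masks, dtype=bool)
    points = np.asarray(points_xy, dtype=np.float64)
    n, h, w = masks.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Squared distance from every pixel to every point, shape (N, H, W)
    d2 = ((xs[None] - points[:, 0, None, None]) ** 2
          + (ys[None] - points[:, 1, None, None]) ** 2)
    # A pixel can only be claimed by a mask that actually covers it
    d2 = np.where(masks, d2, np.inf)
    labels = np.argmin(d2, axis=0)
    labels[~masks.any(axis=0)] = -1
    return labels

# Two hypothetical, partly overlapping masks for closely packed animals
masks = np.zeros((2, 5, 10), dtype=bool)
masks[0, 1:4, 0:6] = True
masks[1, 1:4, 4:10] = True
labels = resolve_overlaps(masks, [(2, 2), (7, 2)])
print(labels)
```

After this step every pixel belongs to at most one animal, so a tight box around each label's pixels gives one distinct box per annotation point even when the original masks overlapped.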

Results

YOLOv8 trained on SAM-labeled boxes obtained an $\text{F}_\text{1}$-score of 72.2% for whales overall and 70.3% for harp seals, a significant improvement over traditional buffer-based annotations. The SAM-derived annotations provided better precision and recall, especially for certain whales and harp seals, which suggests that labels capturing the shape and size of the target animals benefit detection in real-world scenarios.
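For readers less familiar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes all three from detection counts; the function name and the counts in the example are hypothetical and are not taken from the paper, and how detections are matched to ground truth is outside its scope.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts for one class, for illustration only
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 penalizes an imbalance between precision and recall, a detector cannot score well by only suppressing false positives or only maximizing detections.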

The automated box labeling reduced the number of corrections needed for uncertain whales, although challenges persisted due to environmental factors such as sea conditions and submersion. Notably, YOLOv8 trained on SAM-labeled boxes (YOLO-SAM) gave superior detection results in densely populated scenes and across image resolutions, as corroborated by qualitative and quantitative analysis.

Implications and Future Directions

The paper indicates potential applications of SAM-labeled bounding boxes beyond beluga whale detection. The approach could be adapted to other species and remote sensing platforms, including aerial imagery. SAM-generated masks could serve further ecological research by enabling the estimation of whale biometrics like body length and width, contributing valuable data for conservation efforts.

The promising results underscore the utility of combining general-purpose segmentation models like SAM with state-of-the-art object detection algorithms such as YOLOv8 for efficient marine animal monitoring. The method signifies advancement in ecological monitoring, emphasizing the need for continuous innovation to encompass broader species and habitats.

In conclusion, this paper provides an effective framework for reducing manual annotation efforts while enhancing detection accuracy of marine animals using VHR satellite imagery. Future research may explore expanding the pipeline to include other cetaceans and integrate additional environmental variables to further improve detection precision and reliability.
