- The paper introduces an automated method achieving 97.7% accuracy for detecting dolphin whistles in noisy underwater environments.
- It employs a multi-step approach including image pre-processing, Hough transform segmentation, and active contour modeling for precise feature extraction.
- The research enhances conservation efforts by enabling unobtrusive and efficient monitoring of vulnerable estuarine dolphin populations.
Automatic Detection of Estuarine Dolphin Whistles in Spectrogram Images
In this paper, researchers Serra, Martins, and Padovese develop an algorithm to automatically detect vocalizations from estuarine dolphins (Sotalia guianensis) using spectrogram image analysis. The pipeline comprises four main steps: image pre-processing, segmentation using Hough transforms and active contours, feature extraction, and random forest classification.
Methodology
Data Collection and Pre-processing
The study uses passive acoustic monitoring to capture underwater sound in São Paulo, Brazil. Audio signals are transformed into spectrogram images, which represent frequency content over time. Pre-processing applies a Frangi vesselness filter to enhance the tubular, ridge-like shapes that dolphin whistles trace in a spectrogram, highlighting the linear acoustic patterns of interest; a sketch of this stage follows.
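A minimal sketch of this stage using SciPy and scikit-image. The file name, window settings, and filter parameters are illustrative placeholders, not values reported in the paper.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram
from skimage.filters import frangi

# Load a recording (assumed mono; placeholder file name).
rate, audio = wavfile.read("recording.wav")
audio = audio.astype(float)

# Spectrogram: rows are frequency bins, columns are time frames.
freqs, times, sxx = spectrogram(audio, fs=rate, nperseg=1024, noverlap=512)
sxx_db = 10 * np.log10(sxx + 1e-12)  # log scale improves visual contrast

# Normalize to [0, 1] before filtering.
img = (sxx_db - sxx_db.min()) / (sxx_db.max() - sxx_db.min())

# The Frangi filter responds strongly to bright tubular/ridge structures,
# such as the narrowband frequency contours of whistles.
enhanced = frangi(img, sigmas=range(1, 4), black_ridges=False)
```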
Segmentation and Feature Extraction
The segmentation phase uses a probabilistic Hough transform to detect initial linear patterns; an active contour model (snake algorithm) then refines these detections to follow the actual whistle shapes. Six geometric features are computed from each detected shape: the two centroid coordinates, normalized length, moment of inertia, and two measures of mass (average and relative). A combined sketch of segmentation and feature extraction is shown below.
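The following sketch continues from the `enhanced` image in the previous snippet. The binarization threshold, Hough and snake parameters, and the exact feature formulas are assumptions for illustration; the paper's definitions may differ.

```python
import numpy as np
from skimage.filters import gaussian
from skimage.segmentation import active_contour
from skimage.transform import probabilistic_hough_line

binary = enhanced > 0.5  # illustrative threshold on the Frangi response

# Detect straight candidate segments roughly aligned with whistle contours.
segments = probabilistic_hough_line(binary, threshold=10,
                                    line_length=25, line_gap=5)

smoothed = gaussian(enhanced, sigma=2)  # smoothed image for the snake energy
contours = []
for (x0, y0), (x1, y1) in segments:
    # Initialize the snake as points along the detected segment (row, col).
    init = np.stack([np.linspace(y0, y1, 50), np.linspace(x0, x1, 50)], axis=1)
    snake = active_contour(smoothed, init, alpha=0.01, beta=0.1,
                           boundary_condition="fixed")
    contours.append(snake)

def contour_features(snake, image):
    """Illustrative versions of the six geometric features (assumed forms)."""
    r = np.clip(snake[:, 0].astype(int), 0, image.shape[0] - 1)
    c = np.clip(snake[:, 1].astype(int), 0, image.shape[1] - 1)
    mass = image[r, c]                   # intensity sampled along the contour
    centroid_r, centroid_c = snake[:, 0].mean(), snake[:, 1].mean()
    length = np.hypot(np.diff(snake[:, 0]), np.diff(snake[:, 1])).sum()
    norm_length = length / max(image.shape)
    d2 = (snake[:, 0] - centroid_r) ** 2 + (snake[:, 1] - centroid_c) ** 2
    inertia = (mass * d2).sum()          # mass-weighted moment of inertia
    avg_mass = mass.mean()
    rel_mass = mass.sum() / (image.sum() + 1e-12)
    return centroid_r, centroid_c, norm_length, inertia, avg_mass, rel_mass
```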
Classification
The final stage employs a random forest classifier to distinguish true dolphin whistles from false positives. Three binary features derived from manual observation of the detections further improve classifier performance: relative mass, normalized length, and centroid frequency. A minimal classification sketch follows.
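A minimal scikit-learn sketch of this stage. The feature matrix and labels below are random placeholders standing in for the geometric features of real detections; the split ratio and forest size are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))      # placeholder: six geometric features
y = rng.integers(0, 2, size=500)   # placeholder: 1 = whistle, 0 = false positive

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```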
Results and Discussion
The classifier was tuned using a grid search and achieved 97.7% accuracy on the test set, with false positive and false negative rates of 0.034 and 0.016, respectively. The algorithm's robustness is underscored by its applicability in noisy underwater acoustic environments and by its fully automated operation after data collection; a grid-search sketch is shown below.
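Continuing the previous sketch, this is how such a grid search might look with scikit-learn. The parameter grid and scoring choice are assumptions; the paper's actual search space is not reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {                      # illustrative search space
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)        # X_train/y_train from the sketch above
print("best params:", search.best_params_)
print("held-out accuracy:", search.score(X_test, y_test))
```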
Implications
Practical Applications
This work facilitates unobtrusive monitoring of dolphin populations, which is crucial given their vulnerability to human activities such as maritime traffic and pollution. Automatic detection systems can significantly reduce manual labor in analyzing acoustic data and enable timely interventions to mitigate human-induced threats.
Theoretical Contributions
By leveraging advanced image processing and machine learning techniques, this methodology advances the field of bioacoustic monitoring. It opens avenues for further research into detecting vocalizations of other cetacean species with potentially different acoustic signatures.
Future Directions
Building on these results, future research could evaluate the algorithm's efficacy in multi-species environments. Further refinement could improve performance across diverse acoustic backgrounds and integrate more sophisticated machine learning models that adapt to varied vocalization patterns.
In conclusion, this paper makes a substantial contribution to autonomous bioacoustic monitoring, presenting a robust methodology for detecting estuarine dolphin whistles using spectrogram image analysis. The integration of image processing and machine learning techniques offers a valuable tool for ecological studies and conservation efforts.