Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation (1907.13188v1)

Published 30 Jul 2019 in cs.SD, cs.LG, eess.AS, and stat.ML

Abstract: Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally due to the necessity to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that is capable of classifying the vocalizations of three species of whales, non-biological sources of noise, and a fifth class pertaining to ambient noise. In this way, the classifier is capable of detecting the presence and absence of whale vocalizations in an acoustic recording. Through transfer learning, we show that the classifier is capable of learning high-level representations and can generalize to additional species. We also propose a novel representation of acoustic signals that builds upon the commonly used spectrogram representation by way of interpolating and stacking multiple spectrograms produced using different Short-time Fourier Transform (STFT) parameters. The proposed representation is particularly effective for the task of marine mammal species classification where the acoustic events we are attempting to classify are sensitive to the parameters of the STFT.

Summary

The paper presents a novel method using CNNs and stacked interpolated spectrograms for effective classification of marine mammal vocalizations.
The approach leverages architectures like ResNet-50 and VGG-19, achieving superior accuracy, precision, recall, and F-1 score compared to traditional methods.
The study advances non-invasive marine monitoring and lays the groundwork for extending acoustic classification to diverse and challenging environments.

The paper under review applies Convolutional Neural Networks (CNNs) to the classification of marine mammal species from acoustic data. The research focuses on a technique that pairs CNNs with a novel acoustic representation, aiming to improve the accuracy and applicability of automated Detection and Classification Systems (DCS) in bioacoustic monitoring. Its primary contribution is a generalizable DCS framework that classifies vocalizations from three species of whales, alongside non-biological noise sources and ambient noise.

Technical Approach

The proposed model employs CNNs trained on spectrograms created through a novel method dubbed Stacked Interpolated Spectrograms. This representation enhances the acoustic signal by interpolating and stacking spectrograms generated using different Short-Time Fourier Transform (STFT) parameters, capturing varying time and frequency resolutions. Such an approach is particularly effective in addressing the varied acoustic patterns encountered in marine environments, which are often sensitive to the STFT parameters.
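The interpolate-and-stack idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the window sizes, hop length, output shape, log scaling, sample rate, and synthetic test signal are all hypothetical choices, and the helper names (`spectrogram`, `resize_bilinear`, `stacked_interpolated_spectrogram`) are invented for this example.

```python
import numpy as np


def spectrogram(signal, n_fft, hop):
    """Magnitude spectrogram with a Hann window, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def resize_bilinear(image, out_shape):
    """Resize a 2-D array to out_shape by separable linear interpolation."""
    rows, cols = image.shape
    out_rows, out_cols = out_shape
    row_pos = np.linspace(0, rows - 1, out_rows)
    col_pos = np.linspace(0, cols - 1, out_cols)
    # Interpolate along the time axis, one frequency row at a time.
    tmp = np.stack([np.interp(col_pos, np.arange(cols), r) for r in image])
    # Interpolate along the frequency axis, one time column at a time.
    return np.stack(
        [np.interp(row_pos, np.arange(rows), c) for c in tmp.T], axis=1
    )


def stacked_interpolated_spectrogram(
    signal, fft_sizes=(256, 1024, 4096), out_shape=(128, 128)
):
    """Stack spectrograms computed with different STFT window sizes.

    Each window size trades time resolution against frequency resolution,
    so every spectrogram has a different shape. Interpolating them to a
    common grid lets them be stacked as channels, like an RGB image.
    The fft_sizes and out_shape defaults are assumptions for illustration.
    """
    channels = []
    for n_fft in fft_sizes:
        spec = np.log1p(spectrogram(signal, n_fft, hop=n_fft // 4))
        channels.append(resize_bilinear(spec, out_shape))
    return np.stack(channels, axis=0)


# Synthetic 10-second upsweep sampled at a hypothetical 1000 Hz.
sample_rate = 1000
t = np.arange(10 * sample_rate) / sample_rate
signal = np.sin(2 * np.pi * (20 * t + 5 * t**2))

stacked = stacked_interpolated_spectrogram(signal)
print(stacked.shape)  # (channels, frequency, time)
```

The resulting array has one channel per STFT configuration, which is the form a standard image-classification CNN expects as input.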

The CNN-based DCS presented in the paper is capable of classifying blue whales (Balaenoptera musculus), fin whales (Balaenoptera physalus), and sei whales (Balaenoptera borealis). The model's architecture leverages state-of-the-art CNNs such as ResNet-50 and VGG-19, which have proven track records in image classification tasks. The novel acoustic representation, integrated within these architectures, is designed to reduce the necessity for hand-engineered features and increase the generalizability of the classifier across diverse geographical and environmental data sets.

Experimental Results

The experiments report favorable outcomes for the novel acoustic representation. The method shows statistically significant improvements in classification performance over traditional single-channel spectrogram models, with one exception: a VGG-19 model trained on spectrograms produced with one particular STFT parameter setting. The multi-channel approach outperforms the other configurations on accuracy, precision, recall, and F-1 score. Particularly notable is the system's ability to generalize to additional species, demonstrated by adding humpback whale vocalizations through transfer learning.
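For readers less familiar with the four metrics named above, the following sketch shows how accuracy and per-class precision, recall, and F-1 score can be computed for a multi-class classifier. The labels and predictions are made-up toy data, not results from the paper, and the class names and the macro-averaging step are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} using one-vs-rest counts."""
    results = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = (precision, recall, f1)
    return results


# Hypothetical class names loosely mirroring the five classes in the paper.
labels = ["blue", "fin", "sei", "noise", "ambient"]
# Toy ground truth and predictions, invented for this example.
y_true = ["blue", "blue", "fin", "fin", "sei", "sei", "noise", "ambient"]
y_pred = ["blue", "fin", "fin", "fin", "sei", "blue", "noise", "ambient"]

metrics = per_class_metrics(y_true, y_pred, labels)
# Macro average: unweighted mean of the per-class F-1 scores.
macro_f1 = sum(f1 for _, _, f1 in metrics.values()) / len(labels)

print(accuracy(y_true, y_pred))
for label, (precision, recall, f1) in metrics.items():
    print(label, round(precision, 3), round(recall, 3), round(f1, 3))
print(round(macro_f1, 3))
```

Reporting precision and recall per class, rather than accuracy alone, matters in this setting because a detector can score high accuracy on recordings dominated by ambient noise while still missing most whale calls.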

Implications and Future Work

This research marks a significant step in advancing automated DCS for marine mammal monitoring. It bears practical implications, particularly for non-invasive wildlife conservation efforts, such as monitoring marine mammal populations and mitigating human impact through informed policy decisions. The model's robustness and adaptability present potential for extension to other acoustic classification tasks, including soundscape ecology and non-mammalian marine bioacoustics.

Future developments could involve optimizing the CNN architecture to reduce computational costs, thus enabling real-time applications on autonomous recording devices or ocean gliders. Exploring unsupervised learning and data augmentation strategies could also enhance the system's ability to train on limited labeled data, thus enabling the classifier to handle more diverse acoustic environments and species. Additionally, investigating waveform-based deep learning methods could increase system efficiency by circumventing data losses inherent in the transformation of waveform data into spectrograms.

In conclusion, this paper propels the field of automatic bioacoustic classification forward, contributing a highly adaptable and effective system with broad applications in environmental monitoring and species conservation.
