WhaleNet: a Novel Deep Learning Architecture for Marine Mammals Vocalizations on Watkins Marine Mammal Sound Database (2402.17775v2)

Published 20 Feb 2024 in eess.SP, cs.AI, cs.CV, cs.LG, cs.SD, and eess.AS

Abstract: Marine mammal communication is a complex field, hindered by the diversity of vocalizations and environmental factors. The Watkins Marine Mammal Sound Database (WMMD) constitutes a comprehensive labeled dataset employed in machine learning applications. Nevertheless, the methodologies for data preparation, preprocessing, and classification documented in the literature exhibit considerable variability and are typically not applied to the dataset in its entirety. This study initially undertakes a concise review of the state-of-the-art benchmarks pertaining to the dataset, with a particular focus on clarifying data preparation and preprocessing techniques. Subsequently, we explore the utilization of the Wavelet Scattering Transform (WST) and Mel spectrogram as preprocessing mechanisms for feature extraction. In this paper, we introduce WhaleNet (Wavelet Highly Adaptive Learning Ensemble Network), a sophisticated deep ensemble architecture for the classification of marine mammal vocalizations, leveraging both WST and Mel spectrogram for enhanced feature discrimination. By integrating the insights derived from WST and Mel representations, we achieved an improvement in classification accuracy by $8-10\%$ over existing architectures, corresponding to a classification accuracy of $97.61\%$.
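
As a rough illustration of the Mel spectrogram preprocessing named in the abstract, the following is a minimal NumPy sketch. It is not the authors' pipeline: the function names, the frame length, hop size, number of mel bands, and the HTK-style mel formula are all assumptions chosen for illustration, and the abstract does not state which settings WhaleNet uses.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (an assumed convention; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        # Each filter is the positive part of the lower of the two slopes.
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Log-power mel spectrogram in dB, shape (n_mels, n_frames).

    Hypothetical helper: the default parameters are illustrative only.
    Requires len(signal) >= n_fft.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T   # (n_frames, n_mels)
    return 10.0 * np.log10(mel + 1e-10).T               # small offset avoids log(0)

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 1000.0 * t)  # synthetic 1 kHz test tone
    print(mel_spectrogram(tone, sr).shape)
```

The resulting two-dimensional array can be treated as an image-like input to a classifier. A wavelet scattering representation of the same signal would be computed separately, for example with the Kymatio package listed among the paper's references.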

