Advanced Framework for Animal Sound Classification With Features Optimization (2407.03440v1)

Published 3 Jul 2024 in cs.SD, cs.LG, and eess.AS

Abstract: The automatic classification of animal sounds presents an enduring challenge in bioacoustics, owing to the diverse statistical properties of sound signals, variations in recording equipment, and prevalent low Signal-to-Noise Ratio (SNR) conditions. Deep learning models like Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) have excelled in human speech recognition but have not been effectively tailored to the intricate nature of animal sounds, which exhibit substantial diversity even within the same domain. We propose an automated classification framework applicable to general animal sound classification. Our approach first optimizes audio features from Mel-frequency cepstral coefficients (MFCC) including feature rearrangement and feature reduction. It then uses the optimized features for the deep learning model, i.e., an attention-based Bidirectional LSTM (Bi-LSTM), to extract deep semantic features for sound classification. We also contribute an animal sound benchmark dataset encompassing oceanic animals and birds. Extensive experimentation with real-world datasets demonstrates that our approach consistently outperforms baseline methods by over 25% in precision, recall, and accuracy, promising advancements in animal sound classification.
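
The abstract does not specify how attention is applied on top of the Bi-LSTM, so the following is only a minimal sketch of one common form: dot-product attention pooling, which collapses a sequence of per-frame vectors into a single clip-level vector. The function names, the `query` vector, and the toy `frames` values are hypothetical stand-ins (for learned parameters and Bi-LSTM outputs), not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse T per-frame vectors into one vector via attention weights.

    hidden_states: list of T equal-length vectors (assumed stand-ins for
                   Bi-LSTM outputs, one per audio frame).
    query:         vector of the same length (hypothetical learned parameter).
    Returns (pooled_vector, attention_weights).
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of the frames, computed dimension by dimension.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


# Toy input: three frames with two features each (illustrative values only).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
pooled, weights = attention_pool(frames, query)
```

In this toy input the third frame has the largest dot product with the query, so it receives the largest weight. In a full model the pooled vector would typically feed a classification layer.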
