Towards small and accurate convolutional neural networks for acoustic biodiversity monitoring (2312.03666v1)
Abstract: Automated classification of animal sounds is a prerequisite for large-scale monitoring of biodiversity. Convolutional Neural Networks (CNNs) are among the most promising algorithms, but they are slow, often achieve poor classification in the field and typically require large training data sets. Our objective was to design CNNs that are fast at inference time and achieve good classification performance while learning from moderate-sized data. Recordings from a rainforest ecosystem were used. Start and end points of sounds from 20 bird species were manually annotated. Spectrograms from 10-second segments were used as CNN input. We designed simple CNNs with a frequency unwrapping layer (SIMP-FU models) such that any output unit was connected to all spectrogram frequencies but only to a sub-region of time, the Receptive Field (RF). Our models allowed experimentation with different RF durations. Models used either time-indexed labels that encode the start and end points of sounds or simpler segment-level labels. Models learning from time-indexed labels performed considerably better than their segment-level counterparts. The best classification performance was achieved for models with an intermediate RF duration of 1.5 seconds. The best SIMP-FU models achieved AUCs over 0.95 in 18 of 20 classes on the test set. On compact low-cost hardware, the best SIMP-FU models evaluated up to seven times faster than real-time data acquisition. RF duration was a major driver of classification performance. The optimum of 1.5 s was in the same range as the duration of the sounds. Our models achieved good classification performance while learning from moderate-sized training data. This is explained by the use of time-indexed labels during training and an adequately sized RF. The results confirm the feasibility of deploying small CNNs with good classification performance on compact low-cost devices.
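The central architectural idea, a CNN whose output units are connected to all spectrogram frequencies but only to a bounded time window, can be sketched as below. This is a minimal illustrative sketch in PyTorch, not the authors' SIMP-FU architecture: the layer counts, kernel sizes, spectrogram shape and pooling choices are assumptions; only the frequency-unwrapping step and the per-time-step (time-indexed) outputs follow the abstract's description.

```python
# Minimal sketch (not the authors' exact SIMP-FU model): a small CNN whose
# "frequency unwrapping" step folds the whole frequency axis into the channel
# dimension, so each per-time-step output sees all frequencies while its
# temporal Receptive Field (RF) stays bounded. Shapes, kernel sizes and the
# class count are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn


class SimpFuSketch(nn.Module):
    def __init__(self, n_mels: int = 64, n_classes: int = 20):
        super().__init__()
        # 2-D convolutions over (frequency, time); small time kernels keep the RF short.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(3, 3), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),            # halves frequency and time
            nn.Conv2d(16, 32, kernel_size=(3, 3), padding=(1, 1)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2)),
        )
        reduced_mels = n_mels // 4                        # after two 2x poolings
        # Frequency unwrapping: the remaining frequency bins become channels, so the
        # following 1-D convolution over time connects every output unit to the full
        # frequency range while its temporal RF remains limited.
        self.time_conv = nn.Conv1d(32 * reduced_mels, 64, kernel_size=5, padding=2)
        self.classifier = nn.Conv1d(64, n_classes, kernel_size=1)

    def forward(self, spec: torch.Tensor):
        # spec: (batch, 1, n_mels, n_frames), e.g. a 10 s mel-spectrogram segment
        x = self.features(spec)                           # (B, 32, n_mels/4, T/4)
        b, c, f, t = x.shape
        x = x.reshape(b, c * f, t)                        # unwrap frequency into channels
        x = torch.relu(self.time_conv(x))
        frame_logits = self.classifier(x)                 # (B, n_classes, T/4): time-indexed scores
        segment_logits = frame_logits.max(dim=-1).values  # segment-level score via max over time
        return frame_logits, segment_logits


# Usage: frame_logits can be trained against time-indexed (start/end) labels, while the
# max-pooled segment_logits support simpler segment-level labels.
model = SimpFuSketch()
dummy = torch.randn(2, 1, 64, 431)                       # ~10 s at a typical hop size (assumption)
frame_logits, segment_logits = model(dummy)
```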