
AudioProtoPNet: An interpretable deep learning model for bird sound classification (2404.10420v3)

Published 16 Apr 2024 in cs.LG

Abstract: Deep learning models have significantly advanced acoustic bird monitoring by being able to recognize numerous bird species based on their vocalizations. However, traditional deep learning models are black boxes that provide no insight into their underlying computations, limiting their usefulness to ornithologists and machine learning engineers. Explainable models could facilitate debugging, knowledge discovery, trust, and interdisciplinary collaboration. This study introduces AudioProtoPNet, an adaptation of the Prototypical Part Network (ProtoPNet) for multi-label bird sound classification. It is an inherently interpretable model that uses a ConvNeXt backbone to extract embeddings, with the classification layer replaced by a prototype learning classifier trained on these embeddings. The classifier learns prototypical patterns of each bird species' vocalizations from spectrograms of training instances. During inference, audio recordings are classified by comparing them to the learned prototypes in the embedding space, providing explanations for the model's decisions and insights into the most informative embeddings of each bird species. The model was trained on the BirdSet training dataset, which consists of 9,734 bird species and over 6,800 hours of recordings. Its performance was evaluated on the seven test datasets of BirdSet, covering different geographical regions. AudioProtoPNet outperformed the state-of-the-art model Perch, achieving an average AUROC of 0.90 and a cmAP of 0.42, with relative improvements of 7.1% and 16.7% over Perch, respectively. These results demonstrate that even for the challenging task of multi-label bird sound classification, it is possible to develop powerful yet inherently interpretable deep learning models that provide valuable insights for ornithologists and machine learning engineers.
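The abstract describes the core inference step: patch embeddings from the backbone are compared to learned class prototypes, and the best-matching similarities are combined into class scores. The paper does not give the exact formulation here, so the following is only a minimal NumPy sketch of generic prototype-based classification; the function name `prototype_logits`, the use of cosine similarity, and the linear class-weight combination are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def prototype_logits(patch_embeddings, prototypes, class_weights):
    """Sketch of prototype-based scoring (hypothetical, not the paper's exact math).

    patch_embeddings: (num_patches, dim)  embeddings of spectrogram patches
    prototypes:       (num_prototypes, dim)  learned prototypical patterns
    class_weights:    (num_prototypes, num_classes)  prototype-to-class weights
    Returns per-class logits of shape (num_classes,).
    """
    # Cosine similarity between every patch and every prototype.
    e = patch_embeddings / np.linalg.norm(patch_embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = e @ p.T                      # (num_patches, num_prototypes)

    # For each prototype, keep its best match anywhere in the recording
    # (this max also identifies the patch that "explains" the prediction).
    max_sims = sims.max(axis=0)         # (num_prototypes,)

    # Combine prototype evidence into multi-label class logits.
    return max_sims @ class_weights     # (num_classes,)
```

In a multi-label setting such as BirdSet, these logits would typically pass through independent sigmoids rather than a softmax, since several species can vocalize in the same recording.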
