VI-PANN: Harnessing Transfer Learning and Uncertainty-Aware Variational Inference for Improved Generalization in Audio Pattern Recognition (2401.05531v2)
Abstract: Transfer learning (TL) is an increasingly popular approach to training deep learning (DL) models: knowledge gained by training a foundation model on diverse, large-scale datasets is reused on downstream tasks where less domain- or task-specific data is available. The literature is rich with TL techniques and applications; however, most of this research uses deterministic DL models, which are often uncalibrated and cannot communicate a measure of epistemic (model) uncertainty in their predictions. Unlike their deterministic counterparts, Bayesian DL (BDL) models are often well-calibrated, provide access to epistemic uncertainty for each prediction, and can achieve competitive predictive performance. In this study, we propose variational inference pre-trained audio neural networks (VI-PANNs). VI-PANNs are variational inference variants of the popular ResNet-54 architecture, pre-trained on AudioSet, a large-scale audio event detection dataset. We evaluate the quality of the resulting uncertainty when transferring knowledge from VI-PANNs to downstream acoustic classification tasks using the ESC-50, UrbanSound8K, and DCASE2013 datasets. We demonstrate, for the first time, that it is possible to transfer calibrated uncertainty information along with knowledge from upstream tasks to enhance a model's capability to perform downstream tasks.
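To make the notion of epistemic uncertainty concrete, the following is a minimal sketch of one common way to separate total predictive uncertainty into aleatoric and epistemic parts from repeated stochastic forward passes of a Bayesian model. It is not taken from the paper: the function name `decompose_uncertainty`, the single-label softmax setting, and the toy probabilities are assumptions made purely for illustration.

```python
import numpy as np

def decompose_uncertainty(probs, eps=1e-12):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    probs: array of shape (T, N, C) with class probabilities from T
    stochastic forward passes over N inputs and C classes (assumed to be
    softmax outputs; hypothetical setup, not the authors' code).
    """
    mean_probs = probs.mean(axis=0)  # (N, C) posterior predictive mean
    # Total uncertainty: entropy of the averaged prediction.
    total = -np.sum(mean_probs * np.log(mean_probs + eps), axis=-1)
    # Aleatoric uncertainty: average entropy of the individual passes.
    aleatoric = -np.sum(probs * np.log(probs + eps), axis=-1).mean(axis=0)
    # Epistemic uncertainty: the remainder (a mutual information estimate).
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Toy data: input 0 gets the same prediction on every pass,
# input 1 gets conflicting predictions across passes.
agree = np.tile(np.array([[[0.9, 0.05, 0.05]]]), (4, 1, 1))
disagree = np.array([
    [[0.9, 0.05, 0.05]],
    [[0.05, 0.9, 0.05]],
    [[0.05, 0.05, 0.9]],
    [[0.9, 0.05, 0.05]],
])
probs = np.concatenate([agree, disagree], axis=1)  # shape (4, 2, 3)

total, aleatoric, epistemic = decompose_uncertainty(probs)
print("epistemic per input:", epistemic)
```

In this toy case the epistemic term is close to zero for the input on which all passes agree and clearly positive for the input on which they disagree, which is the behavior a deterministic model with a single forward pass cannot expose.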