HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids

Published 2 Jan 2024 in eess.AS, cs.LG, and cs.SD | (2401.01145v5)

Abstract: This paper introduces HAAQI-Net, a non-intrusive deep learning-based music audio quality assessment model for hearing aid users. Unlike traditional methods such as the Hearing-Aid Audio Quality Index (HAAQI), which require intrusive comparisons against a reference signal, HAAQI-Net offers a more accessible and computationally efficient alternative. Using a Bidirectional Long Short-Term Memory (BLSTM) architecture with attention mechanisms and features extracted from the pre-trained BEATs model, it predicts HAAQI scores directly from music audio clips and hearing loss patterns. Experimental results demonstrate HAAQI-Net's effectiveness: it achieves a Linear Correlation Coefficient (LCC) of 0.9368, a Spearman's Rank Correlation Coefficient (SRCC) of 0.9486, and a Mean Squared Error (MSE) of 0.0064, while reducing inference time from 62.52 to 2.54 seconds. To further address computational overhead, a knowledge distillation strategy was applied, reducing parameters by 75.85% and inference time by 96.46% while maintaining strong performance (LCC: 0.9071, SRCC: 0.9307, MSE: 0.0091). To expand its capabilities, HAAQI-Net was adapted to predict subjective human ratings such as the Mean Opinion Score (MOS) through fine-tuning; this adaptation significantly improved prediction accuracy, as validated through statistical analysis. Furthermore, the robustness of HAAQI-Net was evaluated under varying Sound Pressure Level (SPL) conditions, revealing optimal performance at a reference SPL of 65 dB, with accuracy gradually decreasing as the SPL deviated from this point. These advances in subjective score prediction, SPL robustness, and computational efficiency position HAAQI-Net as a scalable solution for music audio quality assessment in hearing aid applications, contributing to efficient and accurate models in audio signal processing and hearing aid technology.
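
To make the described architecture concrete, here is a minimal sketch of a HAAQI-Net-style non-intrusive predictor, assuming PyTorch. The layer sizes, the 8-band audiogram encoding of the hearing loss pattern, and the way it is fused with the acoustic features are illustrative assumptions, not the authors' exact configuration; the BEATs encoder is stubbed out as a pre-extracted frame-feature sequence.

```python
# Hypothetical sketch of a BLSTM-with-attention quality regressor over
# BEATs-like features plus a hearing-loss pattern (not the paper's exact model).
import torch
import torch.nn as nn

class QualityPredictor(nn.Module):
    def __init__(self, feat_dim=768, hl_dim=8, hidden=128):
        super().__init__()
        # Project the hearing-loss pattern (e.g. an 8-band audiogram)
        # so it can be added to each acoustic frame.
        self.hl_proj = nn.Linear(hl_dim, feat_dim)
        self.blstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                             bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)  # frame-level attention weights
        self.head = nn.Sequential(nn.Linear(2 * hidden, 1), nn.Sigmoid())

    def forward(self, feats, audiogram):
        # feats: (B, T, feat_dim) BEATs-like frame features
        # audiogram: (B, hl_dim) hearing-loss thresholds
        x = feats + self.hl_proj(audiogram).unsqueeze(1)
        h, _ = self.blstm(x)                    # (B, T, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # (B, T, 1) attention weights
        pooled = (w * h).sum(dim=1)             # attention pooling over time
        return self.head(pooled).squeeze(-1)    # predicted score in [0, 1]

model = QualityPredictor()
score = model(torch.randn(2, 100, 768), torch.randn(2, 8))
print(score.shape)  # torch.Size([2])
```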

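For readers unfamiliar with the three reported evaluation metrics, the snippet below shows how LCC (Pearson), SRCC (Spearman), and MSE are computed between predicted and ground-truth HAAQI scores; the score arrays are made-up placeholders, not data from the paper.

```python
# Illustration of the reported metrics, assuming numpy and scipy.
import numpy as np
from scipy.stats import pearsonr, spearmanr

true = np.array([0.81, 0.42, 0.65, 0.90, 0.33])  # placeholder ground truth
pred = np.array([0.78, 0.45, 0.60, 0.88, 0.40])  # placeholder predictions

lcc, _ = pearsonr(pred, true)    # linear (Pearson) correlation
srcc, _ = spearmanr(pred, true)  # rank (Spearman) correlation
mse = np.mean((pred - true) ** 2)
print(f"LCC={lcc:.4f}  SRCC={srcc:.4f}  MSE={mse:.4f}")
```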
