
ML-ASPA: A Contemplation of Machine Learning-based Acoustic Signal Processing Analysis for Sounds, & Strains Emerging Technology (2402.10005v1)

Published 18 Dec 2023 in cs.SD, cs.AI, cs.LG, and eess.AS

Abstract: Acoustic data serves as a fundamental cornerstone in advancing scientific and engineering understanding across diverse disciplines, spanning biology, communications, and ocean and Earth science. This inquiry meticulously explores recent advancements and transformative potential within the domain of acoustics, specifically focusing on ML and deep learning. ML, comprising an extensive array of statistical techniques, proves indispensable for autonomously discerning and leveraging patterns within data. In contrast to traditional acoustics and signal processing, ML adopts a data-driven approach, unveiling intricate relationships between features and desired labels or actions, as well as among features themselves, given ample training data. The application of ML to expansive sets of training data facilitates the discovery of models elucidating complex acoustic phenomena such as human speech and reverberation. The dynamic evolution of ML in acoustics yields compelling results and holds substantial promise for the future. The advent of electronic stethoscopes and analogous recording and data logging devices has expanded the application of acoustic signal processing concepts to the analysis of bowel sounds. This paper critically reviews existing literature on acoustic signal processing for bowel sound analysis, outlining fundamental approaches and applicable machine learning principles. It chronicles historical progress in signal processing techniques that have facilitated the extraction of valuable information from bowel sounds, emphasizing advancements in noise reduction, segmentation, signal enhancement, feature extraction, sound localization, and machine learning techniques...
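The processing chain the abstract outlines (segmentation of sound events from a noisy recording via short-time energy, followed by feature extraction) can be sketched in a few lines. The sketch below is illustrative only, not the paper's method: the frame length, hop size, and the median-based threshold rule are all assumptions chosen for the synthetic example.

```python
import numpy as np

def short_time_energy(x, frame_len=256, hop=128):
    # Slice the signal into overlapping frames and sum squared samples per frame.
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return (frames ** 2).sum(axis=1)

def segment_events(x, frame_len=256, hop=128, k=3.0):
    # Flag frames whose energy exceeds k times the median frame energy
    # (an assumed, simple threshold rule standing in for a real detector).
    e = short_time_energy(x, frame_len, hop)
    return e > k * np.median(e)

# Synthetic example: one second of low-level noise containing a single tonal burst,
# a crude stand-in for a bowel-sound event in a quiet recording.
rng = np.random.default_rng(0)
fs = 2000
x = 0.01 * rng.standard_normal(fs)
x[800:1000] += np.sin(2 * np.pi * 150 * np.arange(200) / fs)

mask = segment_events(x)
```

The flagged frames (`mask`) would then feed a feature-extraction stage (e.g. spectral or cepstral features per event) before classification; a real system would replace the fixed threshold with an adaptive or learned detector.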

