
Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction (2301.00448v2)

Published 1 Jan 2023 in eess.AS, cs.LG, and cs.SD

Abstract: Classical methods for acoustic scene mapping require the estimation of time difference of arrival (TDOA) between microphones. Unfortunately, TDOA estimation is very sensitive to reverberation and additive noise. We introduce an unsupervised data-driven approach that exploits the natural structure of the data. Our method builds upon local conformal autoencoders (LOCA) - an offline deep learning scheme for learning standardized data coordinates from measurements. Our experimental setup includes a microphone array that measures the transmitted sound source at multiple locations across the acoustic enclosure. We demonstrate that LOCA learns a representation that is isometric to the spatial locations of the microphones. The performance of our method is evaluated using a series of realistic simulations and compared with other dimensionality-reduction schemes. We further assess the influence of reverberation on the results of LOCA and show that it demonstrates considerable robustness.
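The isometry claim in the abstract can be made concrete with a small check. The sketch below is an illustration only, not the authors' evaluation code: it assumes a hypothetical learned `embedding` and known microphone `positions`, and compares their pairwise distances. An isometric embedding preserves all pairwise distances, so the error is near zero even if the embedding is rotated or shifted relative to the true coordinates. The function names and the synthetic data are assumptions introduced for this example.

```python
import numpy as np


def pairwise_distances(points):
    """Return the (n, n) matrix of Euclidean distances between rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def isometry_error(embedding, positions):
    """Relative Frobenius error between the two pairwise-distance matrices.

    Hypothetical metric for illustration: 0 means the embedding preserves
    every pairwise distance, so it is isometric to `positions`.
    """
    d_emb = pairwise_distances(embedding)
    d_pos = pairwise_distances(positions)
    return np.linalg.norm(d_emb - d_pos) / np.linalg.norm(d_pos)


# Synthetic stand-in for microphone locations in a 5 m x 5 m enclosure.
rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 5.0, size=(50, 2))

# A rotated and shifted copy is still isometric to the true locations.
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
rigid = positions @ rotation.T + np.array([1.0, -2.0])

# Stretching one axis distorts distances, so it is not isometric.
stretched = positions * np.array([2.0, 1.0])

err_rigid = isometry_error(rigid, positions)
err_stretched = isometry_error(stretched, positions)
print(f"rigid: {err_rigid:.2e}, stretched: {err_stretched:.2f}")
```

Comparing distance matrices rather than raw coordinates avoids having to align the embedding to the ground truth first, since an unsupervised method can only recover locations up to a rigid transformation.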

