HOMULA-RIR: A Room Impulse Response Dataset for Teleconferencing and Spatial Audio Applications Acquired Through Higher-Order Microphones and Uniform Linear Microphone Arrays (2402.13896v1)

Published 21 Feb 2024 in eess.AS and eess.SP

Abstract: In this paper, we present HOMULA-RIR, a dataset of room impulse responses (RIRs) acquired using both higher-order microphones (HOMs) and a uniform linear array (ULA), in order to model a remote-attendance teleconferencing scenario. Specifically, measurements were performed in a seminar room, where a 64-microphone ULA served as the multichannel audio acquisition system in the proximity of the speakers, while HOMs modeled 25 attendees physically present in the seminar room. The HOMs cover a wide area of the room, making the dataset also suitable for virtual acoustics applications. Through measurements of the reverberation time and clarity index, and sample applications such as source localization and separation, we demonstrate the effectiveness of the HOMULA-RIR dataset.
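Since the dataset is validated through reverberation time and clarity index measurements, a minimal sketch of how such parameters can be estimated from a single RIR may be useful. The snippet below uses Schroeder backward integration for a T20-based T60 estimate and a 50 ms early/late energy split for C50. The file name homula_rir_example.wav and the use of the soundfile package are illustrative assumptions, not part of the dataset's documented interface.

```python
# Minimal sketch: estimate reverberation time (T60 via the T20 method) and
# clarity index (C50) from a single RIR using Schroeder backward integration.
# The file name below is hypothetical; the actual HOMULA-RIR layout may differ.
import numpy as np
import soundfile as sf

rir, fs = sf.read("homula_rir_example.wav")  # hypothetical file name
rir = np.atleast_2d(rir.T)[0]                # keep first channel if multichannel

# Schroeder backward-integrated energy decay curve (EDC), in dB
energy = rir ** 2
edc = np.cumsum(energy[::-1])[::-1]
edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)

# T20-based T60: linear fit of the decay between -5 dB and -25 dB,
# extrapolated to 60 dB of decay
t = np.arange(len(edc_db)) / fs
mask = (edc_db <= -5.0) & (edc_db >= -25.0)
slope, intercept = np.polyfit(t[mask], edc_db[mask], 1)
t60 = -60.0 / slope

# Clarity index C50: early-to-late energy ratio with a 50 ms split point
split = int(0.05 * fs)
c50 = 10.0 * np.log10(energy[:split].sum() / energy[split:].sum())

print(f"Estimated T60: {t60:.2f} s, C50: {c50:.1f} dB")
```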
