Room impulse response reconstruction with physics-informed deep learning (2401.01206v1)
Abstract: A method is presented for estimating and reconstructing the sound field within a room using physics-informed neural networks. By incorporating a limited set of experimental room impulse responses as training data, this approach combines neural network processing capabilities with the underlying physics of sound propagation, as expressed by the wave equation. The network's ability to estimate particle velocity and intensity, in addition to sound pressure, demonstrates its capacity to represent the flow of acoustic energy and completely characterise the sound field with only a few measurements. Additionally, the potential of this network as a tool for improving acoustic simulations is investigated, motivated by its proficiency in offering grid-free sound field mappings with minimal inference time. Furthermore, the proposed approach is compared against current approaches for sound field reconstruction, specifically both data-driven techniques and elementary wave-based regression methods. The results demonstrate that the physics-informed neural network stands out when reconstructing the early part of the room impulse response, while simultaneously allowing for complete sound field characterisation in the time domain.
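As a rough illustration of the physics the abstract refers to, the sketch below checks the wave-equation residual of a pressure field and derives particle velocity and instantaneous intensity from it. Everything here is an assumption made for illustration: a 1-D analytic plane wave (`plane_wave`) stands in for a trained network, finite differences replace the automatic differentiation a physics-informed network would normally use, and the constants `C` and `RHO` as well as the function names are hypothetical. Obtaining velocity from Euler's equation is one common route and is not necessarily the procedure used in the paper.

```python
import math

C = 343.0    # assumed speed of sound in air, m/s
RHO = 1.2    # assumed air density, kg/m^3


def plane_wave(x, t, freq=500.0):
    """Stand-in for a trained network: 1-D plane wave travelling in +x."""
    k = 2.0 * math.pi * freq / C
    return math.sin(k * (x - C * t))


def wave_residual(p, x, t, hx=1e-3, ht=1e-6):
    """Central-difference estimate of d2p/dt2 - c^2 * d2p/dx2 at (x, t).

    A physics-informed loss would penalise this quantity at sampled
    collocation points, alongside the misfit to the measured responses.
    """
    p_tt = (p(x, t + ht) - 2.0 * p(x, t) + p(x, t - ht)) / ht**2
    p_xx = (p(x + hx, t) - 2.0 * p(x, t) + p(x - hx, t)) / hx**2
    return p_tt - C**2 * p_xx


def particle_velocity(p, x, t_end, n_steps=2000, hx=1e-3):
    """Integrate Euler's equation rho * du/dt = -dp/dx over [0, t_end].

    Assumes u(x, 0) = 0, which holds for plane_wave at x = 0.
    """
    dt = t_end / n_steps

    def dudt(t):
        dpdx = (p(x + hx, t) - p(x - hx, t)) / (2.0 * hx)
        return -dpdx / RHO

    # Trapezoidal rule in time.
    total = 0.5 * (dudt(0.0) + dudt(t_end))
    for i in range(1, n_steps):
        total += dudt(i * dt)
    return total * dt


def instantaneous_intensity(p, x, t_end):
    """Instantaneous intensity I = p * u along the x axis."""
    return p(x, t_end) * particle_velocity(p, x, t_end)


if __name__ == "__main__":
    t_end = 0.0013
    print("residual:", wave_residual(plane_wave, 1.0, 0.01))
    print("velocity:", particle_velocity(plane_wave, 0.0, t_end))
    print("intensity:", instantaneous_intensity(plane_wave, 0.0, t_end))
```

For a plane wave the velocity should match p / (RHO * C) and the intensity should be positive, indicating energy flowing in the +x direction. Neither the residual nor the velocity step needs a spatial grid, since both are evaluated at arbitrary points (x, t).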