iMagLS: Interaural Level Difference with Magnitude Least-Squares Loss for Optimized First-Order Head-Related Transfer Function (2311.16702v1)
Abstract: Binaural reproduction for headphone-based listening is an active research area due to its widespread use in evolving technologies such as augmented and virtual reality (AR and VR). On the one hand, these applications demand high quality spatial audio perception to preserve the sense of immersion. On the other hand, recording devices may only have a few microphones, leading to low-order representations such as first-order Ambisonics (FOA). However, first-order Ambisonics leads to limited externalization and spatial resolution. In this paper, a novel head-related transfer function (HRTF) preprocessing optimization loss is proposed, and is minimized using nonlinear programming. The new method, denoted iMagLS, involves the introduction of an interaural level difference (ILD) error term to the now widely used MagLS optimization loss for the lateral plane angles. Results indicate that the ILD error could be substantially reduced, while the HRTF magnitude error remains similar to that obtained with MagLS. These results could prove beneficial to the overall spatial quality of first-order Ambisonics, while other reproduction methods could also benefit from considering this modified loss.
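To make the structure of the proposed loss concrete, the following is a minimal sketch of a magnitude error combined with an ILD error term, evaluated at a single frequency over a set of directions. It is not the paper's implementation. The function names, the normalization, the per-direction ILD definition (a plain left/right magnitude ratio in dB), the mean absolute ILD error, and the `ild_weight` parameter are all assumptions made for illustration. In practice a loss of this kind would be passed to a nonlinear optimizer over the first-order HRTF coefficients.

```python
import math

# Illustrative sketch only: all names and the exact error definitions below
# are assumptions, not the formulation used in the paper.

def ild_db(mag_left, mag_right):
    """Per-direction ILD in dB from left/right HRTF magnitudes (assumed definition)."""
    return [20.0 * math.log10(l / r) for l, r in zip(mag_left, mag_right)]

def magnitude_error(ref, approx):
    """Squared magnitude error over directions, normalized by the reference energy."""
    num = sum((a - r) ** 2 for r, a in zip(ref, approx))
    den = sum(r ** 2 for r in ref)
    return num / den

def combined_loss(ref_l, ref_r, est_l, est_r, ild_weight=1.0):
    """Magnitude error for both ears plus a weighted mean absolute ILD error (dB)."""
    mag_err = magnitude_error(ref_l, est_l) + magnitude_error(ref_r, est_r)
    ild_ref = ild_db(ref_l, ref_r)
    ild_est = ild_db(est_l, est_r)
    ild_err = sum(abs(e - r) for r, e in zip(ild_ref, ild_est)) / len(ild_ref)
    return mag_err + ild_weight * ild_err

# Hypothetical magnitudes for two directions.
ref_left, ref_right = [1.0, 2.0], [1.0, 1.0]
est_left, est_right = [1.0, 1.0], [1.0, 1.0]
loss = combined_loss(ref_left, ref_right, est_left, est_right, ild_weight=1.0)
```

Setting `ild_weight=0.0` in this sketch reduces the loss to the magnitude term alone, which mirrors the idea of adding an ILD term to an existing magnitude-based loss.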