iMagLS: Interaural Level Difference with Magnitude Least-Squares Loss for Optimized First-Order Head-Related Transfer Function (2311.16702v1)

Published 28 Nov 2023 in eess.AS and cs.SD

Abstract: Binaural reproduction for headphone-based listening is an active research area due to its widespread use in evolving technologies such as augmented and virtual reality (AR and VR). On the one hand, these applications demand high quality spatial audio perception to preserve the sense of immersion. On the other hand, recording devices may only have a few microphones, leading to low-order representations such as first-order Ambisonics (FOA). However, first-order Ambisonics leads to limited externalization and spatial resolution. In this paper, a novel head-related transfer function (HRTF) preprocessing optimization loss is proposed, and is minimized using nonlinear programming. The new method, denoted iMagLS, involves the introduction of an interaural level difference (ILD) error term to the now widely used MagLS optimization loss for the lateral plane angles. Results indicate that the ILD error could be substantially reduced, while the HRTF magnitude error remains similar to that obtained with MagLS. These results could prove beneficial to the overall spatial quality of first-order Ambisonics, while other reproduction methods could also benefit from considering this modified loss.
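As a rough illustration of the kind of loss the abstract describes, the sketch below adds an ILD error term to a magnitude error term. It is not the paper's formulation: the function names, the array layout `(n_freqs, n_dirs)`, the broadband energy-ratio definition of ILD, the mean-squared error forms, and the `ild_weight` trade-off parameter are all assumptions made for illustration. In practice the directions passed to the ILD term would be restricted to the angles of interest, and the loss would be handed to a nonlinear optimizer.

```python
import numpy as np


def ild_db(h_left, h_right, eps=1e-12):
    """Broadband ILD in dB per direction (assumed definition).

    h_left, h_right: complex HRTFs with shape (n_freqs, n_dirs).
    """
    energy_left = np.sum(np.abs(h_left) ** 2, axis=0)
    energy_right = np.sum(np.abs(h_right) ** 2, axis=0)
    # eps guards against log of zero for silent channels
    return 10.0 * np.log10((energy_left + eps) / (energy_right + eps))


def magnitude_error(h_ref, h_est):
    """Mean squared difference of HRTF magnitudes (phase is ignored)."""
    return np.mean((np.abs(h_ref) - np.abs(h_est)) ** 2)


def combined_loss(ref_left, ref_right, est_left, est_right, ild_weight=1.0):
    """Magnitude error for both ears plus a weighted ILD error (hypothetical form)."""
    mag_term = magnitude_error(ref_left, est_left) + magnitude_error(ref_right, est_right)
    ild_term = np.mean((ild_db(ref_left, ref_right) - ild_db(est_left, est_right)) ** 2)
    return mag_term + ild_weight * ild_term
```

With `ild_weight=0` this reduces to a plain magnitude error, so the weight can be read as how much ILD accuracy is traded against magnitude accuracy.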

