
Analytical model for the relation between signal bandwidth and spatial resolution in Steered-Response Power Phase Transform (SRP-PHAT) maps (2402.06586v1)

Published 9 Feb 2024 in cs.SD, eess.AS, and eess.SP

Abstract: An analysis of the relationship between the bandwidth of acoustic signals and the required resolution of steered-response power phase transform (SRP-PHAT) maps used for sound source localization is presented. This relationship does not rely on the far-field assumption, nor does it depend on any specific array topology. The proposed analysis considers the computation of an SRP map as a process of sampling a set of generalized cross-correlation (GCC) functions, each one corresponding to a different microphone pair. From this approach, we derive a rule that relates the GCC bandwidth to the inter-microphone distance, the resolution of the SRP map, and the potential position of the sound source relative to the array position. This rule is a sufficient condition for an aliasing-free calculation of the specified SRP-PHAT map. Simulation results show that limiting the bandwidth of the GCC according to this rule leads to significant reductions in sound source localization errors when sources are not in the immediate vicinity of the microphone array. These error reductions are more relevant for coarser resolutions of the SRP map, and they happen in both anechoic and reverberant environments.
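The sampling view described in the abstract can be illustrated with a small numerical sketch. The abstract does not state the paper's rule, so the code below is not that rule: it assumes a plain Nyquist-style criterion (bandwidth no larger than one over twice the largest TDOA jump between adjacent grid points), a 2-D grid, and a speed of sound of 343 m/s. All function names and the geometry are hypothetical and chosen only for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def tdoa_map(mic_a, mic_b, xs, ys, c=SPEED_OF_SOUND):
    """TDOA (seconds) of one microphone pair at every point of a 2-D grid."""
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    pts = np.stack([gx, gy], axis=-1)
    dist_a = np.linalg.norm(pts - np.asarray(mic_a, dtype=float), axis=-1)
    dist_b = np.linalg.norm(pts - np.asarray(mic_b, dtype=float), axis=-1)
    return (dist_a - dist_b) / c


def max_tdoa_step(tau):
    """Largest TDOA change between grid points adjacent along either axis."""
    step_x = np.abs(np.diff(tau, axis=0)).max()
    step_y = np.abs(np.diff(tau, axis=1)).max()
    return max(step_x, step_y)


def nyquist_style_bandwidth(mic_a, mic_b, lo, hi, step):
    """Assumed criterion (not the paper's rule): B <= 1 / (2 * max TDOA step).

    The grid covers the square [lo, hi] x [lo, hi] with spacing `step`.
    """
    n = int(round((hi - lo) / step)) + 1
    axis = np.linspace(lo, hi, n)
    tau = tdoa_map(mic_a, mic_b, axis, axis)
    return 1.0 / (2.0 * max_tdoa_step(tau))


# Hypothetical geometry: two microphones 0.2 m apart, sources 1 to 3 m away.
fine = nyquist_style_bandwidth((0.0, 0.0), (0.2, 0.0), 1.0, 3.0, step=0.1)
coarse = nyquist_style_bandwidth((0.0, 0.0), (0.2, 0.0), 1.0, 3.0, step=0.5)
print(f"bandwidth bound, 0.1 m grid: {fine:.0f} Hz")
print(f"bandwidth bound, 0.5 m grid: {coarse:.0f} Hz")
```

Under this assumed criterion, a coarser grid samples each GCC more sparsely and therefore tolerates less bandwidth, which is consistent with the abstract's remark that band-limiting matters most at coarse map resolutions.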

