Theoretical Framework for the Optimization of Microphone Array Configuration for Humanoid Robot Audition (2401.03286v1)
Abstract: Audition is an important capability of a humanoid robot. Previous work has presented robot systems capable of sound localization and source segregation based on microphone arrays with various configurations. However, no theoretical framework for the design of these arrays has been presented. In the current paper, a design framework is proposed based on a novel array quality measure. The measure is based on the effective rank of a matrix composed of the generalized head-related transfer functions (GHRTFs), which account for microphone positions other than the ears. The measure is shown to be theoretically related to standard array performance measures such as beamforming robustness and direction-of-arrival (DOA) estimation accuracy. The measure is then applied to produce sample designs of microphone arrays. Their performance is investigated numerically, verifying the advantages of array design based on the proposed theoretical framework.
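As a rough illustration of the quantity the abstract refers to, the effective rank of a matrix is commonly defined (following Roy and Vetterli) as the exponential of the Shannon entropy of its normalized singular values. The sketch below computes that quantity with NumPy. The matrix layout (rows as microphones, columns as source directions), the random stand-in data, and the names `effective_rank`, `H_diverse`, and `H_redundant` are assumptions made for illustration only; the paper's actual GHRTF matrix and quality measure may be constructed differently.

```python
import numpy as np


def effective_rank(matrix):
    """Effective rank: exp of the Shannon entropy of the normalized singular values."""
    s = np.linalg.svd(np.asarray(matrix), compute_uv=False)
    total = s.sum()
    if total == 0:
        # Convention chosen here for the all-zero matrix (an assumption).
        return 0.0
    p = s / total      # singular values normalized to sum to 1
    p = p[p > 0]       # drop exact zeros so log() stays finite
    return float(np.exp(-np.sum(p * np.log(p))))


# Hypothetical stand-ins for a GHRTF matrix at a single frequency:
# rows = microphones, columns = source directions (assumed layout).
rng = np.random.default_rng(0)
n_mics, n_dirs = 4, 36
H_diverse = (rng.standard_normal((n_mics, n_dirs))
             + 1j * rng.standard_normal((n_mics, n_dirs)))

# Every microphone sees the same response: no spatial diversity.
H_redundant = np.tile(H_diverse[:1], (n_mics, 1))

print("diverse array:  ", effective_rank(H_diverse))
print("redundant array:", effective_rank(H_redundant))
```

Under these assumptions, an array whose microphones all observe the same transfer function has an effective rank close to 1, while an array with linearly independent responses approaches the number of microphones. This matches the intuition that a higher effective rank indicates more usable spatial information.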