Real-time Speech Extraction Using Spatially Regularized Independent Low-rank Matrix Analysis and Rank-constrained Spatial Covariance Matrix Estimation
Abstract: Real-time speech extraction is an important challenge with various applications such as speech recognition in a human-like avatar/robot. In this paper, we propose the real-time extension of a speech extraction method based on independent low-rank matrix analysis (ILRMA) and rank-constrained spatial covariance matrix estimation (RCSCME). The RCSCME-based method is a multichannel blind speech extraction method that demonstrates superior speech extraction performance in diffuse noise environments. To improve the performance, we introduce spatial regularization into the ILRMA part of the RCSCME-based speech extraction and design two regularizers. Speech extraction experiments demonstrated that the proposed methods can function in real time and the designed regularizers improve the speech extraction performance.
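As a rough illustration of what spatial regularization of ILRMA can look like, the sketch below computes one generic penalty: the squared Frobenius distance between each frequency bin's demixing matrix and a reference demixing matrix, scaled by a weight. This is not one of the two regularizers designed in the paper, which the abstract does not specify. The names `spatial_penalty`, `W`, `W_ref`, and `lam` are hypothetical and chosen only for this example.

```python
def spatial_penalty(W, W_ref, lam):
    """Return lam * sum over frequency bins f of ||W[f] - W_ref[f]||_F^2.

    W and W_ref are lists (one entry per frequency bin) of square
    matrices, each given as a list of rows of complex numbers.
    This is an assumed, generic penalty form, not the paper's design.
    """
    total = 0.0
    for W_f, R_f in zip(W, W_ref):
        for row_w, row_r in zip(W_f, R_f):
            for w, r in zip(row_w, row_r):
                # Squared magnitude of the element-wise difference.
                total += abs(w - r) ** 2
    return lam * total


# Toy example: two frequency bins, two channels (values are made up).
W_ref = [
    [[1 + 0j, 0j], [0j, 1 + 0j]],
    [[1 + 0j, 0j], [0j, 1 + 0j]],
]
W = [
    [[1 + 0j, 0j], [0j, 1 + 0j]],  # identical to the reference
    [[1 + 1j, 0j], [0j, 1 + 0j]],  # one entry deviates by 1j
]
print(spatial_penalty(W, W_ref, lam=0.5))
```

In a regularized separation method, a term of this kind would be added to the separation cost so that the estimated demixing matrices stay close to a spatial prior; a larger `lam` enforces the prior more strongly.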