Modified Group Delay Based MultiPitch Estimation in Co-Channel Speech (1603.05435v1)

Published 17 Mar 2016 in cs.SD

Abstract: Phase processing has been replaced by group delay processing for the extraction of source and system parameters from speech. Group delay functions are ill-behaved when the transfer function has zeros that are close to the unit circle in the z-domain. The modified group delay function addresses this problem and has been used successfully for formant and monopitch estimation. In this paper, modified group delay functions are used for multipitch estimation in concurrent speech. The power spectrum of the speech is first flattened to annihilate the system characteristics while retaining the source characteristics. Group delay analysis on this flattened spectrum picks the predominant pitch in the first pass, and a comb filter is used to filter out the estimated pitch along with its harmonics. The residual spectrum is then analyzed for the next candidate pitch estimate in the second pass. The final pitch trajectories of the constituent speech utterances are formed using pitch grouping and post-processing techniques. The performance of the proposed algorithm was evaluated on standard datasets using two metrics: pitch accuracy and standard deviation of fine pitch error. Our results show that the proposed algorithm is a promising pitch detection method in multipitch environments for real speech recordings.
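
The abstract does not give the group delay formulas, so the following is only a minimal sketch of the two functions it refers to, using the formulation commonly found in the group delay literature rather than this paper's exact implementation. It assumes the group delay is computed from the DFTs of x[n] and n·x[n], and that the modified version replaces the squared-magnitude denominator with a cepstrally smoothed spectrum and compresses the result with an exponent. The parameter names and defaults (`alpha`, `gamma`, `lifter`) are illustrative assumptions, and the sketch works on a generic sequence, whereas the paper applies the analysis to a flattened power spectrum.

```python
import numpy as np


def _spectra(x, n_fft=None):
    """Return the DFTs of x[n] and n*x[n], the two ingredients of group delay."""
    x = np.asarray(x, dtype=float)
    n_fft = n_fft or len(x)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(n * x, n_fft)
    return X, Y, n_fft


def group_delay(x, n_fft=None):
    """Standard group delay: (X_R*Y_R + X_I*Y_I) / |X|^2.

    The denominator approaches zero when zeros of the transfer function lie
    close to the unit circle, which is the ill-behaved case the abstract
    mentions.
    """
    X, Y, _ = _spectra(x, n_fft)
    num = X.real * Y.real + X.imag * Y.imag
    return num / np.maximum(np.abs(X) ** 2, 1e-12)


def modified_group_delay(x, n_fft=None, alpha=0.4, gamma=0.9, lifter=8):
    """Modified group delay (assumed common formulation, not taken from the paper).

    The denominator |X|^2 is replaced by S^(2*gamma), where S is a cepstrally
    smoothed magnitude spectrum, and the result is compressed as
    sign(tau) * |tau|^alpha. alpha, gamma and lifter are illustrative values.
    """
    X, Y, n_fft = _spectra(x, n_fft)
    # Cepstral smoothing: keep only the low-quefrency part of the real cepstrum.
    log_mag = np.log(np.maximum(np.abs(X), 1e-12))
    cep = np.fft.irfft(log_mag, n_fft)
    cep[lifter:n_fft - lifter + 1] = 0.0
    S = np.exp(np.fft.rfft(cep).real)
    tau = (X.real * Y.real + X.imag * Y.imag) / np.maximum(S ** (2 * gamma), 1e-12)
    return np.sign(tau) * np.abs(tau) ** alpha
```

A simple sanity check is a unit impulse delayed by d samples: its spectrum has unit magnitude and linear phase, so `group_delay` returns d at every frequency bin, and `modified_group_delay` returns d raised to the power `alpha`.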