Statistical parameterization of f0 and formant estimation

Identify and formalize the underlying parameters that are being estimated when fundamental frequency f0 and formant frequencies are computed from speech recordings. Doing so requires a probabilistic model of the speech signal in which f0 and the formant frequencies are explicit, identifiable parameters amenable to statistical estimation and inference.

Background

Many phonetic studies rely on automatic or semi-automatic estimates of fundamental frequency and formants, whose accuracy is often checked manually. The prevailing perspective largely comes from signal processing, and a rigorous statistical framework specifying what these estimators target is lacking.

The authors explicitly mark this as an open modeling question and call for a statistical formulation that makes f0 and formants explicit parameters within a probabilistic model of the speech signal, enabling principled estimation and uncertainty quantification.
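To make concrete what "f0 as an explicit parameter" could mean, the following is a minimal sketch under strong assumptions, not the formulation the authors call for (the source leaves that open). It assumes a single short frame generated by a harmonic-plus-noise model, y(t) = sum over k of [a_k cos(2π k f0 t) + b_k sin(2π k f0 t)] + Gaussian white noise, with a fixed number of harmonics. Under that assumption, maximizing the Gaussian likelihood over f0 is equivalent to minimizing the residual sum of squares after a least-squares fit of the amplitudes. All function names, the sampling rate, the frame length, and the search grid are illustrative choices, and the data are synthetic.

```python
import numpy as np

def harmonic_design(t, f0, n_harmonics):
    """Cosine and sine regressors at f0, 2*f0, ..., n_harmonics*f0."""
    k = np.arange(1, n_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    return np.hstack([np.cos(phase), np.sin(phase)])

def profile_rss(y, t, f0, n_harmonics):
    """Residual sum of squares after fitting amplitudes by least squares.

    Under i.i.d. Gaussian noise (an assumption of this sketch), the f0 that
    minimizes this quantity maximizes the profile likelihood.
    """
    X = harmonic_design(t, f0, n_harmonics)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def estimate_f0(y, t, grid, n_harmonics=3):
    """Grid-search estimate of f0 under the assumed harmonic model."""
    rss = np.array([profile_rss(y, t, f, n_harmonics) for f in grid])
    return float(grid[np.argmin(rss)]), rss

# Synthetic 40 ms frame (hypothetical values, not real speech).
rng = np.random.default_rng(0)
fs = 8000.0                              # sampling rate in Hz
t = np.arange(int(0.04 * fs)) / fs       # sample times in seconds
true_f0 = 120.0
y = (1.0 * np.cos(2 * np.pi * true_f0 * t)
     + 0.5 * np.sin(2 * np.pi * 2 * true_f0 * t)
     + 0.3 * np.cos(2 * np.pi * 3 * true_f0 * t)
     + 0.1 * rng.standard_normal(t.size))

grid = np.arange(80.0, 300.0, 0.5)       # candidate f0 values in Hz
f0_hat, rss = estimate_f0(y, t, grid)
print(f"estimated f0: {f0_hat:.1f} Hz (simulated with {true_f0:.1f} Hz)")
```

In this toy setting the estimand is unambiguous because the data were generated with a known f0. Real speech is only quasi-periodic and its noise is not white, so the sketch illustrates the kind of model-based statement the open problem asks for rather than answering it; formants would need an additional component, for example a parametric vocal-tract filter, which is not shown here.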

References

Some open statistical modeling questions in phonetics are the following. Many studies are based on the fundamental frequency $f_0$ and the formant frequencies. The accuracy of those measurements cannot be taken for granted, so they are manually checked in many studies. However the perspective taken is mostly a signal-processing one, and a more statistical approach is lacking: what underlying parameter is being estimated when computing $f_0$ and formant frequencies?

Statistics in Phonetics (2404.07567 - Tavakoli et al., 11 Apr 2024) in Section 6 (Conclusions and open problems in phonetics research)