Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release (2004.07442v1)
Abstract: With the development of smart devices, such as the Amazon Echo and Apple's HomePod, speech data have become a new dimension of big data. However, privacy and security concerns may hinder the collection and sharing of real-world speech data, which contain the speaker's identifiable information, i.e., voiceprint, which is considered a type of biometric identifier. Current studies on voiceprint privacy protection do not provide either a meaningful privacy-utility trade-off or a formal and rigorous definition of privacy. In this study, we design a novel and rigorous privacy metric for voiceprint privacy, which is referred to as voice-indistinguishability, by extending differential privacy. We also propose mechanisms and frameworks for privacy-preserving speech data release satisfying voice-indistinguishability. Experiments on public datasets verify the effectiveness and efficiency of the proposed methods.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.