
Speaker De-identification System using Autoencoders and Adversarial Training (2011.04696v1)

Published 9 Nov 2020 in cs.SD, cs.CL, and eess.AS

Abstract: The rapid growth of web services and mobile apps that collect personal data from users increases the risk that user privacy may be severely compromised. In particular, the increasing variety of spoken language interfaces and voice assistants, empowered by the vertiginous breakthroughs in Deep Learning, is prompting serious concerns in the European Union about preserving speech data privacy. For instance, an attacker can record speech from users and impersonate them to gain access to systems that require voice identification. Existing technology can also be used to build speaker profiles of users by extracting speaker, linguistic (e.g., dialect), and paralinguistic (e.g., age) features from the speech signal. To mitigate these weaknesses, in this paper we propose a speaker de-identification system based on adversarial training and autoencoders that suppresses speaker, gender, and accent information in speech. Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system while preserving the intelligibility of the anonymized spoken content.
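The abstract reports its privacy result as an increase in the equal error rate (EER) of a speaker verification system. As a rough illustration of that metric only, the sketch below estimates the EER from two lists of similarity scores. The function name `equal_error_rate`, the threshold sweep, and the example scores are assumptions made for this illustration; they are not taken from the paper, and the paper's own evaluation tooling may differ.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores (higher = more likely same speaker).

    genuine:  scores for trials where both utterances come from the same speaker
    impostor: scores for trials where the speakers differ
    """
    # Candidate thresholds: every distinct score observed in either list.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False acceptance rate: impostor trials accepted at threshold t.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection rate: genuine trials rejected at threshold t.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # The EER is taken where FAR and FRR are closest to equal.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores: well-separated trials versus fully overlapping ones.
separated = equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])
overlapping = equal_error_rate([0.1, 0.4, 0.6, 0.9], [0.1, 0.4, 0.6, 0.9])
```

In this toy setting, a verifier whose genuine and impostor scores overlap more has a higher EER, which is the direction a de-identification system aims to push the verifier.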

Citations (15)
