
Learning to fool the speaker recognition (2004.03434v1)

Published 7 Apr 2020 in eess.AS, cs.CR, and cs.SD

Abstract: Due to the widespread deployment of fingerprint, face, and speaker recognition systems, attacks on deep-learning-based biometric systems have drawn increasing attention. Previous research mainly studied attacks on vision-based systems, such as fingerprint and face recognition. Attacks on speaker recognition, by contrast, have not yet been investigated, even though speaker recognition is widely used in daily life. In this paper, we attempt to fool a state-of-the-art speaker recognition model and present the speaker recognition attacker, a lightweight model that fools a deep speaker recognition model by adding imperceptible perturbations to the raw speech waveform. We find that speaker recognition systems are also vulnerable to such attacks, and we achieve a high success rate on the non-targeted attack. We also present an effective method for optimizing the speaker recognition attacker to trade off attack success rate against perceptual quality. Experiments on the TIMIT dataset show that we can achieve a sentence error rate of 99.2% with an average SNR of 57.2 dB and a PESQ of 4.2, at a speed faster than real time.
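
The abstract reports two quantities that are easy to compute: the sentence error rate of the attacked recognizer and the SNR of the perturbed waveform. The sketch below illustrates both under stated assumptions. It uses the common definition of SNR in decibels (clean-signal energy over perturbation energy), which the abstract does not spell out, and it treats the sentence error rate as the fraction of utterances whose predicted speaker differs from the true one. The function names `snr_db` and `sentence_error_rate` and the toy sine-wave signal are illustrative only and are not taken from the paper.

```python
import math
import random


def snr_db(clean, perturbed):
    """SNR in dB of an additive perturbation (assumed standard definition)."""
    signal_energy = sum(s * s for s in clean)
    # The perturbation is whatever was added on top of the clean waveform.
    noise_energy = sum((p - s) ** 2 for s, p in zip(clean, perturbed))
    if noise_energy == 0:
        return math.inf  # identical waveforms: no perturbation at all
    return 10.0 * math.log10(signal_energy / noise_energy)


def sentence_error_rate(true_ids, predicted_ids):
    """Fraction of utterances assigned to the wrong speaker."""
    errors = sum(t != p for t, p in zip(true_ids, predicted_ids))
    return errors / len(true_ids)


if __name__ == "__main__":
    # Toy stand-in for speech: one second of a 440 Hz tone at 16 kHz.
    rng = random.Random(0)
    clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
    # Small random noise stands in for a learned perturbation.
    perturbed = [s + rng.uniform(-1e-3, 1e-3) for s in clean]
    print(f"SNR: {snr_db(clean, perturbed):.1f} dB")
    print("SER:", sentence_error_rate([1, 2, 3, 4], [1, 0, 0, 0]))
```

A higher SNR means a quieter perturbation relative to the speech, so the trade-off described in the abstract amounts to keeping the sentence error rate high while keeping the SNR (and PESQ, which this sketch does not compute) high as well.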

Authors (7)
  1. Jiguo Li
  2. Xinfeng Zhang
  3. Jizheng Xu
  4. Li Zhang
  5. Yue Wang
  6. Siwei Ma
  7. Wen Gao
Citations (20)
