Device Feature based on Graph Fourier Transformation with Logarithmic Processing For Detection of Replay Speech Attacks (2404.17280v1)

Published 26 Apr 2024 in cs.SD and eess.AS

Abstract: The most common spoofing attacks on automatic speaker verification systems are replay speech attacks. Detection of replay speech heavily relies on replay configuration information. Previous studies have shown that graph Fourier transform-derived features can effectively detect replay speech but ignore device and environmental noise effects. In this work, we propose a new feature, the graph frequency device cepstral coefficient, derived from the graph frequency domain using a device-related linear transformation. We also introduce two novel representations: graph frequency logarithmic coefficient and graph frequency logarithmic device coefficient. We evaluate our methods using traditional Gaussian mixture model and light convolutional neural network systems as classifiers. On the ASVspoof 2017 V2, ASVspoof 2019 physical access, and ASVspoof 2021 physical access datasets, our proposed features outperform known front-ends, demonstrating their effectiveness for replay speech detection.
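As a rough illustration of what a graph-frequency logarithmic representation could look like, the sketch below applies a graph Fourier transform to a single frame and takes the log magnitude of the coefficients. It is not the paper's pipeline: the abstract does not specify the graph construction, the device-related linear transformation, or the cepstral step, so the unweighted path graph, the `eps` floor, and the function names `path_graph_laplacian`, `graph_fourier_basis`, and `log_graph_spectrum` are all assumptions made for this example.

```python
import numpy as np


def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path graph on n nodes.

    Assumption: a path graph linking consecutive samples; the paper's actual
    graph construction is not given in the abstract.
    """
    adjacency = np.zeros((n, n))
    idx = np.arange(n - 1)
    adjacency[idx, idx + 1] = 1.0
    adjacency[idx + 1, idx] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def graph_fourier_basis(laplacian):
    """Return graph frequencies (eigenvalues, ascending) and the basis U.

    The columns of U are orthonormal eigenvectors of the symmetric Laplacian.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues, eigenvectors


def log_graph_spectrum(frame, eigenvectors, eps=1e-10):
    """Graph Fourier transform of one frame followed by log magnitude.

    eps is a hypothetical floor that keeps the logarithm finite.
    """
    coefficients = eigenvectors.T @ frame  # forward GFT: x_hat = U^T x
    return np.log(np.abs(coefficients) + eps)


# Example: a 16-sample synthetic frame standing in for a windowed speech frame.
rng = np.random.default_rng(0)
frame = rng.standard_normal(16)
frequencies, basis = graph_fourier_basis(path_graph_laplacian(frame.size))
features = log_graph_spectrum(frame, basis)
```

In this sketch the eigenvalues play the role of graph frequencies, so `features[k]` is the log magnitude of the frame's component at the k-th graph frequency. A device-related transformation, as the abstract describes, would be applied on top of such coefficients, but its form is not stated here and is left out.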

