a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification (2403.01355v1)
Abstract: Spoofing detection is now a mainstream research topic. Standard metrics can be applied to evaluate spoofing detection solutions in isolation, and others have been proposed to support their evaluation when they are combined with speaker detection. These metrics either have well-known deficiencies or restrict the architectures that can be used to combine speaker and spoof detectors. In this paper, we propose an architecture-agnostic detection cost function (a-DCF). A generalisation of the original DCF used widely for the assessment of automatic speaker verification (ASV), the a-DCF is designed for the evaluation of spoofing-robust ASV. Like the DCF, the a-DCF reflects the cost of decisions in a Bayes risk sense, with explicitly defined class priors and an explicitly defined detection cost model. We demonstrate the merit of the a-DCF by benchmarking architecturally heterogeneous spoofing-robust ASV solutions.
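To make the Bayes-risk reading concrete, the following is a minimal sketch rather than the paper's reference implementation. It assumes the cost is a prior-weighted sum of three error rates measured on a single output score: misses on target trials, false accepts on non-target trials, and false accepts on spoofed trials. The function name `a_dcf`, the parameter names, and all default priors and costs are hypothetical values chosen for illustration.

```python
import numpy as np

def a_dcf(tar, non, spoof, threshold,
          pi_tar=0.9, pi_non=0.05, pi_spoof=0.05,
          c_miss=1.0, c_fa_non=10.0, c_fa_spoof=10.0):
    """Three-class detection cost at a fixed threshold (illustrative sketch).

    tar, non, spoof: scores from one system for target, non-target and
    spoofed trials. A trial is accepted when its score >= threshold.
    Priors and costs are assumed example values, not taken from the paper.
    """
    tar, non, spoof = (np.asarray(x, dtype=float) for x in (tar, non, spoof))
    p_miss = np.mean(tar < threshold)          # targets wrongly rejected
    p_fa_non = np.mean(non >= threshold)       # non-targets wrongly accepted
    p_fa_spoof = np.mean(spoof >= threshold)   # spoofs wrongly accepted
    return (c_miss * pi_tar * p_miss
            + c_fa_non * pi_non * p_fa_non
            + c_fa_spoof * pi_spoof * p_fa_spoof)

# Toy scores: one error of each kind at threshold 1.0.
cost = a_dcf(tar=[2.0, 3.0, 0.5, 4.0],
             non=[-1.0, 0.0, 1.5, -2.0],
             spoof=[0.0, 1.2, -0.5, -3.0],
             threshold=1.0)
print(cost)
```

Because the function only needs one score per trial, it does not care whether that score comes from a cascade, a score-level fusion, or a single integrated model, which is the sense in which such a metric is architecture-agnostic. Setting `pi_spoof=0` in this sketch leaves a two-class cost of the same form as the conventional DCF.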