A Comparison of Differential Performance Metrics for the Evaluation of Automatic Speaker Verification Fairness (2404.17810v1)
Abstract: When decisions are made and personal data is processed by automated systems, there is an expectation of fairness: that members of different demographic groups receive equitable treatment. This expectation applies to biometric systems such as automatic speaker verification (ASV). We compare three candidate fairness metrics and extend previous work on face recognition by examining differential performance across a range of ASV operating points. Results show that the Gini Aggregation Rate for Biometric Equitability (GARBE) is the only metric that meets the three functional fairness measure criteria. We also present a comprehensive evaluation of the fairness and verification performance of five state-of-the-art ASV systems. Our findings reveal a nuanced trade-off between fairness and verification accuracy, underscoring the complex interplay between system design, demographic inclusiveness, and verification reliability.
- Oubaida Chouchane
- Christoph Busch
- Chiara Galdi
- Nicholas Evans
- Massimiliano Todisco