Assessing Uncertainty in Similarity Scoring: Performance & Fairness in Face Recognition (2211.07245v2)
Abstract: The ROC curve is the main tool for assessing both the performance and the fairness properties of a similarity scoring function. To draw reliable conclusions from empirical ROC analysis, it is essential to accurately evaluate the uncertainty attached to statistical versions of the ROC curves of interest, especially for applications with considerable societal impact such as Face Recognition. In this article, we prove asymptotic guarantees for empirical ROC curves of similarity functions as well as for by-product metrics useful for assessing fairness. We also explain that, because the false acceptance/rejection rates take the form of U-statistics in the case of similarity scoring, the naive bootstrap approach may jeopardize the assessment procedure; a dedicated recentering technique must be used instead. Beyond the theoretical analysis, various experiments on real face image datasets provide strong empirical evidence of the practical relevance of the methods promoted here when applied to several ROC-based measures, such as popular fairness metrics.
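The recentering issue mentioned in the abstract can be illustrated on a toy case. The sketch below is not the authors' procedure: it assumes a simplified one-sample, degree-2 U-statistic with kernel h(x, y) = 1{s(x, y) > t}, where s is cosine similarity, and it ignores identity labels. All names (`u_stat`, `v_stat`, `bootstrap_deviations`) and the synthetic embeddings are hypothetical. It relies on one standard fact: when indices are resampled with replacement, the bootstrap expectation of the resampled U-statistic equals the V-statistic of the original sample (diagonal pairs included), not the original U-statistic. Centering the bootstrap replicates on the U-statistic therefore introduces a systematic shift, whereas centering on the V-statistic removes it.

```python
import numpy as np

def u_stat(h):
    """Average of the kernel matrix over off-diagonal pairs (i != j)."""
    n = h.shape[0]
    return (h.sum() - np.trace(h)) / (n * (n - 1))

def v_stat(h):
    """Average of the kernel matrix over all pairs, diagonal included."""
    return h.mean()

def bootstrap_deviations(h, n_boot=200, seed=0):
    """Return bootstrap deviations centered naively (on U_n) and recentered (on V_n)."""
    rng = np.random.default_rng(seed)
    n = h.shape[0]
    u_n, v_n = u_stat(h), v_stat(h)
    naive = np.empty(n_boot)
    recentered = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample observations with replacement
        u_star = u_stat(h[np.ix_(idx, idx)])  # U-statistic of the bootstrap sample
        naive[b] = u_star - u_n               # naive centering
        recentered[b] = u_star - v_n          # centering on the bootstrap mean of u_star
    return naive, recentered

# Synthetic unit-norm embeddings (an assumption made purely for illustration).
rng = np.random.default_rng(42)
emb = rng.normal(size=(60, 8))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)

t = 0.5                                       # hypothetical similarity threshold
h = ((emb @ emb.T) > t).astype(float)         # kernel matrix h(x_i, x_j)
u_n, v_n = u_stat(h), v_stat(h)
naive, recentered = bootstrap_deviations(h)

print("U_n =", u_n, " V_n =", v_n)
print("mean naive deviation      =", naive.mean())
print("mean recentered deviation =", recentered.mean())
```

In this toy setting the two sets of deviations differ by the constant V_n - U_n, which is positive here because every observation has similarity 1 with itself. Confidence bands built from the naive deviations would inherit that shift.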