Unifying and extending Precision Recall metrics for assessing generative models (2405.01611v1)
Abstract: With the recent success of generative models in image and text, the evaluation of generative models has gained considerable attention. Whereas most generative models are compared in terms of scalar values such as the Fréchet Inception Distance (FID) or the Inception Score (IS), Sajjadi et al. (2018) proposed a precision-recall curve to characterize the closeness of two distributions. Since then, various approaches to precision and recall have seen the light (Kynkäänniemi et al., 2019; Naeem et al., 2020; Park & Kim, 2023). They center their attention on the extreme values of precision and recall, but beyond this fact, their ties remain elusive. In this paper, we unify most of these approaches under the same umbrella, building on the work of Simon et al. (2019). In doing so, we are able not only to recover entire curves, but also to expose the sources of the reported pitfalls of the metrics concerned. We also provide consistency results that go well beyond those presented in the corresponding literature. Finally, we study the different behaviors of the curves obtained experimentally.
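As background for the metrics the abstract refers to, the "improved" precision and recall of Kynkäänniemi et al. (2019), one of the approaches unified in this paper, can be sketched with k-nearest-neighbor manifold estimates. This is a minimal illustration, not the authors' implementation; the function names are ours:

```python
import numpy as np

def knn_radii(X, k):
    """Distance from each point of X to its k-th nearest neighbor in X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)  # column 0 is the point itself (distance 0)
    return d[:, k]

def coverage(A, B, radii_B):
    """Fraction of points of A lying in the union of k-NN balls around B."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return float(np.mean((d <= radii_B[None, :]).any(axis=1)))

def improved_precision_recall(real, fake, k=3):
    # precision: fake samples covered by the estimated real manifold
    precision = coverage(fake, real, knn_radii(real, k))
    # recall: real samples covered by the estimated fake manifold
    recall = coverage(real, fake, knn_radii(fake, k))
    return precision, recall
```

With identical sample sets both values equal 1, and fake samples placed far from the real ones drive both to 0, which matches the extreme-value behavior these metrics are designed around.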
- Precision recall cover: A method for assessing generative models. In International Conference on Artificial Intelligence and Statistics, pp. 6571–6594. PMLR, 2023.
- A probabilistic theory of pattern recognition, volume 31. Springer Science & Business Media, 2013.
- Precision-recall curves using information divergence frontiers. In International Conference on Artificial Intelligence and Statistics, pp. 2550–2559. PMLR, 2020.
- Rate of convergence of k-nearest-neighbor classification rule. Journal of Machine Learning Research, 18(227):1–16, 2018.
- Ghosh, A. K. On optimum choice of k in nearest neighbor classification. Computational Statistics & Data Analysis, 50(11):3113–3123, 2006.
- Optimal smoothing in kernel discriminant analysis. Statistica Sinica, pp. 457–483, 2004.
- Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces. The Journal of Machine Learning Research, 22(1):6702–6726, 2021.
- GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30, 2017.
- Emergent asymmetry of precision and recall for measuring fidelity and diversity of generative models in high dimensions. In Proceedings of the 40th International Conference on Machine Learning, ICML’23. JMLR.org, 2023.
- Improved precision and recall metric for assessing generative models. Advances in Neural Information Processing Systems, 32, 2019.
- Evaluating generative networks using Gaussian mixtures of image features. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 279–288, 2023.
- Reliable fidelity and diversity metrics for generative models. In International Conference on Machine Learning, pp. 7176–7185. PMLR, 2020.
- Probabilistic precision and recall towards reliable evaluation of generative models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 20099–20109, 2023.
- Assessing generative models via precision and recall. Advances in Neural Information Processing Systems, 31, 2018.
- Improved techniques for training GANs. Advances in Neural Information Processing Systems, 29, 2016.
- Revisiting precision recall definition for generative modeling. In International Conference on Machine Learning, pp. 5799–5808. PMLR, 2019.
- On the theoretical equivalence of several trade-off curves assessing statistical proximity. Journal of Machine Learning Research, 24(185):1–34, 2023. URL http://jmlr.org/papers/v24/21-0607.html.
- Benjamin Sykes
- Loic Simon
- Julien Rabin