Unsupervised Estimation of Ensemble Accuracy (2311.10940v2)
Abstract: Ensemble learning combines several individual models to obtain better generalization performance. In this work we present a practical method for estimating the joint power of several classifiers. Unlike existing approaches, which focus on "diversity" measures, it does not rely on labels, making it both accurate and practical in the modern setting of unsupervised learning with huge datasets. The heart of the method is a combinatorial bound on the number of mistakes the ensemble is likely to make; the bound can be efficiently approximated in time linear in the number of samples. We relate the bound to actual misclassifications, establishing its usefulness as a predictor of performance. We demonstrate the method on popular large-scale face recognition datasets, which provide a useful playground for fine-grained classification tasks using noisy data over many classes.
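The sketch below illustrates the label-free flavor of such an estimate: it scans the ensemble's predictions on unlabeled data in a single linear pass, collecting pairwise disagreement rates and an agreement-with-majority proxy for joint accuracy. This is a minimal illustrative sketch, not the paper's combinatorial bound; the function names and the specific agreement-based proxy are assumptions made for exposition.

```python
import numpy as np


def pairwise_disagreement(predictions: np.ndarray) -> np.ndarray:
    """Fraction of samples on which each pair of classifiers disagrees.

    predictions: array of shape (n_classifiers, n_samples) holding each
    classifier's predicted class id for each (unlabeled) sample.
    Each pair is handled in time linear in the number of samples.
    """
    m, _ = predictions.shape
    d = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            rate = float(np.mean(predictions[i] != predictions[j]))
            d[i, j] = d[j, i] = rate
    return d


def agreement_based_accuracy_proxy(predictions: np.ndarray) -> float:
    """Label-free proxy for ensemble quality (illustrative only).

    Treats the majority vote as a pseudo-label and reports the mean
    agreement of individual classifiers with it; high agreement is a
    necessary (not sufficient) condition for high joint accuracy.
    """
    _, n = predictions.shape
    proxy = np.empty(n)
    for t in range(n):
        votes = predictions[:, t]
        values, counts = np.unique(votes, return_counts=True)
        majority = values[np.argmax(counts)]
        proxy[t] = np.mean(votes == majority)
    return float(proxy.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three noisy copies of the same underlying labeling over 10 classes.
    truth = rng.integers(0, 10, size=1000)
    preds = np.stack([
        np.where(rng.random(1000) < err,
                 rng.integers(0, 10, size=1000), truth)
        for err in (0.1, 0.2, 0.3)
    ])
    print(pairwise_disagreement(preds))
    print(agreement_based_accuracy_proxy(preds))
```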