Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method (2408.02936v2)

Published 6 Aug 2024 in cs.LG

Abstract: Ensemble learning combines weak learners to produce a strong learner. However, obtaining a large number of base learners requires substantial time and computational resources, so it is worthwhile to study how to match the performance typically obtained with many base learners while using only a few. We argue that achieving this requires enhancing both classification performance and generalization ability during the ensemble process. To increase accuracy, each weak base learner must be integrated more effectively. We observe that different base learners exhibit different levels of accuracy on different classes. To capitalize on this, we introduce a confidence tensor $\tilde{\mathbf{\Theta}}$, where $\tilde{\mathbf{\Theta}}_{rst}$ signifies the degree of confidence that the $t$-th base classifier assigns a sample to class $r$ when it actually belongs to class $s$. To the best of our knowledge, this is the first time the per-class performance of base classifiers has been evaluated in this way. The confidence tensor balances the strengths and weaknesses of each base classifier across classes, enabling the method to achieve superior results with fewer base learners. To enhance generalization, we design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative. Furthermore, we prove that the elements in each column of the loss function's gradient matrix sum to zero, which allows the constrained optimization problem to be solved with gradient-based methods. We compare our algorithm with random forests ten times its size and with other classical methods across numerous datasets, demonstrating the superiority of our approach.
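
The abstract does not specify how $\tilde{\mathbf{\Theta}}$ is estimated or how it enters the margin-based objective, so the sketch below is only a plausible reading, not the paper's algorithm: it approximates $\tilde{\mathbf{\Theta}}_{rst}$ from the confusion counts of each base classifier on a held-out validation set and combines hard base predictions by accumulating the corresponding confidence rows. The function names (`estimate_confidence_tensor`, `combine_with_confidence`) and the normalization choice are illustrative assumptions.

```python
import numpy as np


def estimate_confidence_tensor(val_preds, val_labels, num_classes):
    """Rough estimate of Theta[r, s, t]: confidence that base classifier t
    assigns class r while the true class is s (assumption: estimated from
    validation-set confusion counts, normalized per predicted class)."""
    n_learners = val_preds.shape[1]
    theta = np.zeros((num_classes, num_classes, n_learners))
    for t in range(n_learners):
        for r in range(num_classes):
            for s in range(num_classes):
                theta[r, s, t] = np.sum((val_preds[:, t] == r) & (val_labels == s))
        # Illustrative normalization: turn the counts for each predicted class r
        # into a distribution over true classes s, i.e. P(true = s | t predicted r).
        row_sums = theta[:, :, t].sum(axis=1, keepdims=True)
        theta[:, :, t] /= np.maximum(row_sums, 1)
    return theta


def combine_with_confidence(test_preds, theta):
    """Combine hard base predictions: each classifier contributes the confidence
    row matching its own prediction, and the class with the largest total wins."""
    n_samples, n_learners = test_preds.shape
    num_classes = theta.shape[0]
    scores = np.zeros((n_samples, num_classes))
    for t in range(n_learners):
        scores += theta[test_preds[:, t], :, t]
    return scores.argmax(axis=1)
```

The claim about the gradient matrix can be illustrated under one additional assumption that the abstract does not state: if the ensemble weights form a matrix $\mathbf{W}$ whose columns are constrained to sum to one, and every column of the gradient $\mathbf{G}=\nabla L(\mathbf{W})$ sums to zero, then a plain gradient step stays feasible.

```latex
% Assumptions (illustrative): \mathbf{1}^\top \mathbf{W} = \mathbf{1}^\top (columns of W sum to one)
% and \mathbf{1}^\top \mathbf{G} = \mathbf{0}^\top (each column of the gradient sums to zero).
\[
\mathbf{1}^\top\!\left(\mathbf{W} - \eta\,\mathbf{G}\right)
  = \mathbf{1}^\top\mathbf{W} - \eta\,\mathbf{1}^\top\mathbf{G}
  = \mathbf{1}^\top - \eta\,\mathbf{0}^\top
  = \mathbf{1}^\top,
\]
% so gradient descent never leaves the affine part of the constraint set, which is
% consistent with solving the constrained problem by ordinary gradient-based methods.
```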
