
An Ensemble Generation Method Based on Instance Hardness (1804.07419v2)

Published 20 Apr 2018 in cs.LG, cs.AI, and stat.ML

Abstract: In Machine Learning, ensemble methods have been receiving a great deal of attention. Techniques such as Bagging and Boosting have been successfully applied to a variety of problems. Nevertheless, such techniques are still susceptible to the effects of noise and outliers in the training data. We propose a new method for the generation of pools of classifiers based on Bagging, in which the probability of an instance being selected during the resampling process is inversely proportional to its instance hardness, which can be understood as the likelihood of an instance being misclassified, regardless of the choice of classifier. The goal of the proposed method is to remove noisy data without sacrificing the hard instances which are likely to be found on class boundaries. We evaluate the performance of the method in nineteen public data sets, and compare it to the performance of the Bagging and Random Subspace algorithms. Our experiments show that in high noise scenarios the accuracy of our method is significantly better than that of Bagging.
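The resampling scheme the abstract describes can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes the k-Disagreeing Neighbours (kDN) measure as the instance-hardness estimator and uses `1 - hardness` as the (unnormalised) selection weight, so harder instances are sampled less often; the exact hardness measure and weighting transform used in the paper may differ.

```python
import numpy as np

def kdn_hardness(X, y, k=5):
    """k-Disagreeing Neighbours: the fraction of an instance's k nearest
    neighbours whose class label differs from its own. A common proxy for
    instance hardness (assumed here; the paper may use another measure)."""
    n = len(X)
    hardness = np.empty(n)
    for i in range(n):
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                       # exclude the instance itself
        neighbours = np.argsort(dists)[:k]
        hardness[i] = np.mean(y[neighbours] != y[i])
    return hardness

def hardness_weighted_bags(X, y, n_bags=10, k=5, rng=None):
    """Draw bootstrap samples in which the probability of selecting an
    instance decreases with its hardness, so noisy/outlier instances are
    down-weighted during Bagging-style pool generation."""
    rng = np.random.default_rng(rng)
    h = kdn_hardness(X, y, k=k)
    w = 1.0 - h                                 # easy instances weighted higher
    if w.sum() > 0:
        w = w / w.sum()
    else:                                       # degenerate case: all maximally hard
        w = np.full(len(X), 1.0 / len(X))
    n = len(X)
    return [rng.choice(n, size=n, replace=True, p=w) for _ in range(n_bags)]
```

Each returned index array defines one bootstrap sample; a base classifier would then be trained on each to form the ensemble pool.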

Authors (5)
  1. Felipe N. Walmsley (1 paper)
  2. George D. C. Cavalcanti (24 papers)
  3. Dayvid V. R. Oliveira (3 papers)
  4. Rafael M. O. Cruz (39 papers)
  5. Robert Sabourin (47 papers)
Citations (15)
