A review of ensemble learning and data augmentation models for class imbalanced problems: combination, implementation and evaluation (2304.02858v3)

Published 6 Apr 2023 in cs.LG, cs.AI, and stat.ML

Abstract: Class imbalance (CI) in classification problems arises when the number of observations belonging to one class is lower than that of the other. Ensemble learning combines multiple models to obtain a more robust model and has been prominently paired with data augmentation methods to address class imbalance problems. Over the last decade, numerous strategies have been proposed to enhance ensemble learning and data augmentation methods, alongside new methods such as generative adversarial networks (GANs). Combinations of these techniques have been applied in many studies, and a systematic evaluation of the different combinations would provide better understanding and guidance for different application domains. In this paper, we present a computational study that evaluates data augmentation and ensemble learning methods on prominent benchmark CI problems. We present a general framework that evaluates 9 data augmentation and 9 ensemble learning methods for CI problems. Our objective is to identify the most effective combination for improving classification performance on imbalanced datasets. The results indicate that combining data augmentation methods with ensemble learning can significantly improve classification performance on imbalanced datasets. We find that traditional data augmentation methods such as the synthetic minority oversampling technique (SMOTE) and random oversampling (ROS) not only perform better on the selected CI problems but are also computationally less expensive than GANs. Our findings provide guidance for the development of novel models for handling imbalanced datasets.
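
The paper's framework pairs each data augmentation method with each ensemble learner and compares the resulting combinations. As a minimal illustrative sketch of one such pairing (not the authors' actual implementation), SMOTE can be combined with a random forest using the imbalanced-learn and scikit-learn libraries; the 9:1 synthetic dataset and all parameter values below are assumptions for illustration:

```python
# Minimal sketch: pairing a data augmentation method (SMOTE) with an
# ensemble learner (random forest) for an imbalanced binary problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

# Synthetic binary dataset with a 9:1 class imbalance (illustrative only).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)

# Placing the sampler inside an imblearn Pipeline ensures oversampling is
# applied only to each training fold, not to the validation folds.
pipeline = Pipeline([
    ("augment", SMOTE(random_state=42)),
    ("ensemble", RandomForestClassifier(n_estimators=100, random_state=42)),
])

# F1 on the minority class is a common evaluation metric for CI problems.
scores = cross_val_score(pipeline, X, y, cv=5, scoring="f1")
print(f"Mean F1 across folds: {scores.mean():.3f}")
```

Swapping SMOTE for another sampler (e.g., RandomOverSampler) or the random forest for a boosting method would reproduce other cells of the 9 x 9 grid of combinations the paper evaluates.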
