- The paper presents a comprehensive review of ensemble deep learning methods, detailing core principles such as bias-variance decomposition and diversity induction.
- It categorizes strategies like bagging, boosting, stacking, and implicit/explicit ensembles, providing insights into their theoretical and practical aspects.
- The review highlights successful applications in healthcare, speech recognition, image classification, and forecasting, and outlines future research directions.
Review of Ensemble Deep Learning Models
The paper "Ensemble deep learning: A review" by M.A. Ganaie, Minghui Hu, A.K. Malik, M. Tanveer, and P.N. Suganthan offers a detailed analysis of the state-of-the-art in deep ensemble models, focusing on both theoretical foundations and practical applications. The review categorizes ensemble strategies and examines specific techniques such as bagging, boosting, stacking, and more recent advancements like implicit/explicit ensembles. It also explores the use of these models across varied application domains, providing a comprehensive resource for researchers in the field.
Theoretical Framework
The authors begin by explaining the core principles underpinning the success of ensemble learning: bias-variance decomposition, the statistical, computational, and representational motivations for combining models, and diversity induction among base learners. The paper underscores how these principles explain why ensembles outperform individual models. The bias-variance-covariance decomposition is particularly notable: it shows that an ensemble's expected error shrinks as the covariance between members' errors decreases, so accuracy and diversity together drive the gains. The statistical, computational, and representational arguments, in turn, address three distinct failure modes of a single model: selecting a poor hypothesis from limited data, getting stuck in a local optimum during training, and being unable to represent the true function at all.
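The decomposition the review leans on can be stated concretely. For an ensemble that averages M regressors, the standard bias-variance-covariance result (a textbook statement, not a formula quoted from the paper) gives the expected squared error of the ensemble mean as:

```latex
\mathbb{E}\big[(\bar{f} - t)^2\big]
  \;=\; \overline{\mathrm{bias}}^{\,2}
  \;+\; \frac{1}{M}\,\overline{\mathrm{var}}
  \;+\; \Big(1 - \frac{1}{M}\Big)\,\overline{\mathrm{covar}},
\qquad \bar{f} = \frac{1}{M}\sum_{i=1}^{M} f_i ,
```

where the barred terms are the member-averaged bias, variance, and pairwise error covariance. As M grows, the variance term vanishes and the covariance term dominates, which is exactly why diversity induction (driving covariance down, ideally negative) is treated as a first-class design goal throughout the review.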
Ensemble Strategies
Several ensemble strategies are reviewed in the paper, each elucidated with theoretical background and practical considerations:
- Bagging:
- Bagging involves generating multiple bootstrap samples from the original dataset and training separate models on each. The predictions are then aggregated using techniques like averaging or majority voting. Random Forest, an extension involving random feature selection at each split, is a prominent example.
- Boosting:
- Boosting sequentially trains models, each focusing on the misclassifications of its predecessors. Techniques like AdaBoost and Gradient Boosting have been effectively applied to improve deep models, with models such as Boosted CNNs demonstrating enhanced performance.
- Stacking:
- In stacking, multiple models are trained and a meta-model is employed to combine their predictions. This method utilizes the strengths of various base learners, including deep architectures like Deep Convex Networks (DCN) and Deep Stacking Networks (DSN).
- Negative Correlation Learning (NCL):
- NCL encourages diversity among base models by introducing a penalty for correlated errors. This approach has been extended to deep learning, improving tasks like crowd counting and robust regression.
- Implicit/Explicit Ensembles:
- Implicit methods like Dropout, DropConnect, and Stochastic Depth create an ensemble effect by randomly dropping nodes, connections, or layers during training, so a single trained network behaves like an average over many thinned sub-networks. Explicit methods, such as Snapshot Ensembles, save model weights at the local minima visited under a cyclic learning-rate schedule, yielding an ensemble of snapshots for roughly the cost of training one network.
- Homogeneous vs Heterogeneous Ensembles:
- Homogeneous ensembles utilize the same type of base model, while heterogeneous ensembles combine different types, such as integrating traditional machine learning models with deep learning architectures.
- Decision Fusion Strategies:
- Various methods for aggregating model outputs are discussed, including unweighted averaging, majority voting, Bayes Optimal Classifier, stacked generalization, and super learner techniques. Each method offers different trade-offs in terms of simplicity, performance, and adaptability.
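The bagging recipe described above (bootstrap resampling, independent training, vote-based aggregation) can be sketched in a few lines. This is a minimal illustration with a toy 1-D decision stump as the base learner; the function names and the stump learner are illustrative choices, not anything from the paper.

```python
import random
from collections import Counter

def train_stump(xs, ys):
    """Fit a 1-D decision stump: choose the threshold t (among observed
    values) minimizing training error for the rule 'predict 1 if x >= t'."""
    best_t, best_err = None, float("inf")
    for t in sorted(set(xs)):
        err = sum((1 if x >= t else 0) != y for x, y in zip(xs, ys))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Bagging: train one stump per bootstrap resample (drawn with
    replacement) of the original dataset."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([xs[i] for i in idx],
                                  [ys[i] for i in idx]))
    return models

def predict(models, x):
    """Aggregate the stumps' hard predictions by majority vote."""
    votes = Counter(1 if x >= t else 0 for t in models)
    return votes.most_common(1)[0][0]
```

Random Forest follows the same skeleton, additionally restricting each split to a random feature subset so that the trees (and hence their errors) are less correlated.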
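The NCL penalty for correlated errors also has a compact form. For ensemble mean f̄, each member i is trained on its squared error plus λ·(fᵢ − f̄)·Σⱼ≠ᵢ(fⱼ − f̄), which is negative when the member's deviation opposes the rest of the ensemble. A minimal sketch (the function name and example values are illustrative):

```python
def ncl_loss(preds, target, lam=0.5):
    """Per-member Negative Correlation Learning objective:
    squared error plus a penalty term that is negative (i.e. a reward)
    when a member's deviation from the ensemble mean opposes the
    combined deviation of the other members."""
    m = len(preds)
    f_bar = sum(preds) / m
    losses = []
    for i, f_i in enumerate(preds):
        penalty = (f_i - f_bar) * sum(preds[j] - f_bar
                                      for j in range(m) if j != i)
        losses.append((f_i - target) ** 2 + lam * penalty)
    return losses
```

Since the deviations from the mean sum to zero, the penalty for member i equals −(fᵢ − f̄)², so NCL explicitly trades a member's individual accuracy for disagreement with the ensemble, which lowers the covariance term in the decomposition above.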
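The two simplest fusion rules from the list above, unweighted averaging of class probabilities and hard majority voting, can be sketched directly; the function names here are illustrative, not from the paper:

```python
from collections import Counter

def average_fusion(prob_lists):
    """Unweighted averaging: mean the per-model class-probability
    vectors, then return the argmax class index."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])

def majority_vote(labels):
    """Hard majority voting over per-model label predictions."""
    return Counter(labels).most_common(1)[0][0]
```

Averaging uses each model's confidence and so is gentler with well-calibrated members, while voting discards confidence and is more robust to a single badly miscalibrated model; stacked generalization and the super learner replace these fixed rules with a trained combiner.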
Applications
The practical utility of deep ensemble models spans multiple domains:
- Health Care: The paper illustrates ensemble methods applied to diverse medical tasks such as disease prediction, medical image segmentation, and drug side effect prediction. Techniques like heterogeneous ensembles and implicit/explicit ensembles have shown significant efficacy in improving diagnostic accuracy.
- Speech Recognition: Stacking-based ensemble models have been successfully applied to tasks like semantic utterance classification and spoken language understanding, highlighting the potential for ensemble methods to enhance acoustic model performance.
- Image Classification: Techniques like Boosted CNNs, implicit ensembles (e.g., Dropout), and deep residual networks have demonstrated robust performance improvements in image recognition tasks.
- Forecasting: Ensemble models, including decision fusion and incremental learning approaches, have been effectively utilized for load forecasting and stock trading, leveraging the predictive power of multiple synergistic models.
- Other Domains: The review also covers applications in text categorization, emotion recognition, and information retrieval, showcasing the versatility of ensemble deep learning strategies.
Future Directions
The paper identifies several challenging and open issues for future exploration:
- Integrating big data techniques with ensemble models to handle high-dimensional, large-scale datasets efficiently.
- Exploring randomized learning modules to reduce the complexity and computational cost of deep ensemble models.
- Enhancing diversity in implicit/explicit ensembles to improve their generalization capabilities.
- Developing criteria for model selection within ensemble architectures to optimize performance across different domains.
- Extending ensemble methods to unsupervised learning and deep reinforcement learning to address specific challenges in these areas.
In conclusion, the review provides a thorough examination of deep ensemble learning methods, emphasizing their theoretical foundations, diverse strategies, and impactful applications across various domains. The insights and directions offered pave the way for further advancements and practical implementations in the field of deep ensemble learning.