The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches (1803.01164v2)

Published 3 Mar 2018 in cs.CV

Abstract: Deep learning has demonstrated tremendous success in a variety of application domains in the past few years. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as to new areas that present more opportunities. Different methods have been proposed for different categories of learning approaches, including supervised, semi-supervised, and unsupervised learning. Experimental results show state-of-the-art performance of deep learning over traditional machine learning approaches in image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, NLP, cybersecurity, and many other fields. This report presents a brief survey of the development of DL approaches, including Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN) including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoders (AE), Deep Belief Networks (DBN), Generative Adversarial Networks (GAN), and Deep Reinforcement Learning (DRL). In addition, we include recently developed advanced variants of DL techniques based on these approaches, and survey how DL approaches have been explored and evaluated in different application domains. We also cover recently developed frameworks, SDKs, and benchmark datasets used for implementing and evaluating deep learning approaches. Surveys have been published on deep learning in neural networks [1, 38] and on RL [234]; however, those papers do not discuss the individual advanced techniques for training large-scale deep learning models or the recently developed methods of generative models [1].

Comprehensive Survey on Deep Learning Approaches Since AlexNet

Introduction

The paper "The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches" provides an extensive overview of the advances in deep learning (DL) methodologies since the advent of AlexNet, a pivotal model in the DL community. Since the resurgence of neural networks, spurred by the breakthrough with AlexNet in 2012, there has been a substantial evolution in terms of architectures, training techniques, applications, and theoretical insights. This survey categorizes these advancements across supervised, unsupervised, and reinforcement learning paradigms while underscoring recent developments in training optimization and hardware implementations.

Summary of Key Architectures

The paper begins with a discussion of the major DL architectures, such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN), extending to variants like Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). The survey highlights performance metrics and architectural innovations that have contributed to state-of-the-art results across domains such as image processing, speech recognition, and NLP.
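To make the distinction between the recurrent variants concrete, here is a minimal sketch (assuming PyTorch; the tensor sizes are arbitrary) running the same toy sequence through an LSTM and a GRU. The LSTM maintains a separate cell state alongside its hidden state, while the GRU folds its gating into fewer components.

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 10, 4, 8, 16
x = torch.randn(seq_len, batch, input_size)  # (time, batch, features)

lstm = nn.LSTM(input_size, hidden_size)  # gated cell with a separate cell state
gru = nn.GRU(input_size, hidden_size)    # fewer gates, no separate cell state

lstm_out, (h_n, c_n) = lstm(x)  # c_n is the LSTM's extra cell state
gru_out, h_gru = gru(x)

print(lstm_out.shape, gru_out.shape)  # both: (10, 4, 16)
```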

AlexNet to Residual Networks

AlexNet demonstrated unprecedented performance on the ImageNet dataset, popularizing techniques such as ReLU activations and dropout regularization. This model paved the way for subsequent architectures such as ZFNet, which refined AlexNet's hyperparameters, and VGG, which emphasized depth with uniform convolution kernel sizes. Residual Networks (ResNet), introduced by He et al., addressed the vanishing-gradient problem inherent in deeper networks through skip connections, enabling architectures with over a thousand layers and significantly improving accuracy in image classification tasks.
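As an illustration of the skip-connection idea, the following is a minimal residual block sketch (assuming PyTorch; channel counts and layer sizes are arbitrary, and this is a simplified form of the published ResNet block). The identity shortcut adds the block's input directly to its output, giving gradients a path that bypasses the convolutional layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # identity shortcut: the "skip connection"

block = ResidualBlock(64)
y = block(torch.randn(1, 64, 32, 32))  # output shape matches the input
```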

Generative Models and Auto-Encoders

The paper reviews several generative models including Auto-Encoders (AE), Variational Auto-Encoders (VAE), and Generative Adversarial Networks (GAN). GANs, in particular, have gained prominence due to their ability to generate high-fidelity data samples, thus proving instrumental in unsupervised learning tasks. These models are broadly applied in image generation, style transfer, and anomaly detection.
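The adversarial setup can be summarized in a few lines. Below is a toy training-step sketch (assuming PyTorch; the network sizes, learning rates, and synthetic "real" data are illustrative placeholders, not the configuration of any particular paper): the discriminator is trained to label real samples 1 and generated samples 0, while the generator is trained to make the discriminator output 1 on its samples.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))  # noise -> sample
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))   # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 2) + 3.0  # stand-in for real data
noise = torch.randn(32, 16)
fake = G(noise)

# Discriminator step: real -> 1, fake -> 0 (fake detached so G is untouched)
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: push the discriminator's verdict on fakes toward "real"
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```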

Advances in Training Techniques

Efficient training of DL models has seen innovations such as Batch Normalization, which accelerates training by normalizing layer inputs, and various initialization methods like Xavier and He initialization, which address issues of vanishing/exploding gradients. Optimization techniques have also evolved, with algorithms like Adam and RMSprop providing adaptive learning rates that enhance convergence.
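These pieces compose naturally in practice. The sketch below (assuming PyTorch; layer sizes and the learning rate are arbitrary) wires He initialization, Batch Normalization, and Adam into one small training step.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 256), nn.BatchNorm1d(256), nn.ReLU(),
    nn.Linear(256, 10),
)

# He (Kaiming) initialization keeps activation variance stable under ReLU.
for m in model.modules():
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
        nn.init.zeros_(m.bias)

# Adam maintains adaptive per-parameter step sizes.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x, target = torch.randn(64, 784), torch.randint(0, 10, (64,))
loss = nn.functional.cross_entropy(model(x), target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```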

Regularization and Optimization

Regularization methods such as Dropout and DropConnect have been pivotal in preventing overfitting: Dropout randomly deactivates activations during training, while DropConnect randomly drops weights. Furthermore, adaptive optimization algorithms, particularly Adam and its variants, adjust learning rates dynamically during training, improving model robustness and convergence speed.
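Dropout's train/eval asymmetry is worth seeing directly; the short sketch below (assuming PyTorch) shows the same layer zeroing activations at random during training and acting as the identity at evaluation time.

```python
import torch
import torch.nn as nn

layer = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

layer.train()
print(layer(x))  # roughly half the entries zeroed, survivors scaled by 1/(1-p)

layer.eval()
print(layer(x))  # identity: all ones
```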

Theoretical and Practical Implications

The survey outlines the theoretical advancements in deep learning, emphasizing the universal applicability of DL approaches to various domains. Practically, these methodologies have shown extraordinary success in high-complexity tasks such as autonomous driving, medical imaging analytics, and real-time object detection in videos.

Energy Efficiency and Hardware Implementations

With the exponential growth of DL model sizes, energy-efficient implementations have gained considerable attention. Techniques like binary and ternary weight networks aim to reduce the computational burden, making it feasible to deploy sophisticated models on energy-constrained devices. The paper references various custom hardware solutions like Google's Tensor Processing Units (TPUs) and MIT's Eyeriss, designed explicitly to accelerate DL computations while minimizing power consumption.
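A toy sketch of the weight-binarization idea follows (plain PyTorch tensor operations; a simplification in the spirit of binary-weight networks, not any specific published implementation): full-precision weights are retained for gradient updates, but the forward pass uses only their signs, so the dominant multiplications collapse to sign flips and additions.

```python
import torch

torch.manual_seed(0)
w_real = torch.randn(4, 8)  # latent full-precision weights, kept for updates
w_bin = torch.sign(w_real)  # {-1, 0, +1}; map the rare sign(0) = 0 to +1
w_bin[w_bin == 0] = 1.0

x = torch.randn(8)
y = w_bin @ x               # forward pass reduces to additions and subtractions
print(y)
```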

Applications across Domains

The application of DL approaches spans a wide array of fields. For instance, in medical imaging, CNNs and GANs have been employed for tasks ranging from tumor segmentation to synthetic data generation for training enhancement. In NLP, RNNs and their variants have demonstrated excellence in machine translation and sentiment analysis. DRL, merging DL with reinforcement learning principles, has revolutionized fields like robotics and game playing, exemplified by models like AlphaGo.

Future Directions

The paper projects several promising future directions for DL research. These include enhancing unsupervised and semi-supervised learning methods, improving the interpretability of deep models, and developing more energy-efficient architectures suitable for mobile and embedded systems. Moreover, the integration of DL with quantum computing stands as a potential frontier, promising to tackle even more complex problems with greater computational efficiency.

Conclusion

This comprehensive survey presents an invaluable compilation of advancements in deep learning since AlexNet, structured around model architectures, innovative training methods, practical applications, and future research directions. The paper encapsulates the trajectory of deep learning, from foundational concepts to cutting-edge developments, highlighting the profound impact and limitless potential of these approaches across numerous scientific and engineering domains.

Authors (9)
  1. Md Zahangir Alom
  2. Tarek M. Taha
  3. Christopher Yakopcic
  4. Stefan Westberg
  5. Paheding Sidike
  6. Mst Shamima Nasrin
  7. Brian C Van Esesn
  8. Abdul A S. Awwal
  9. Vijayan K. Asari
Citations (824)