Deep Learning in Bioinformatics (1603.06430v5)

Published 21 Mar 2016 in cs.LG and q-bio.GN

Abstract: In the era of big data, transformation of biomedical big data into valuable knowledge has been one of the most important challenges in bioinformatics. Deep learning has advanced rapidly since the early 2000s and now demonstrates state-of-the-art performance in various fields. Accordingly, application of deep learning in bioinformatics to gain insight from data has been emphasized in both academia and industry. Here, we review deep learning in bioinformatics, presenting examples of current research. To provide a useful and comprehensive perspective, we categorize research both by the bioinformatics domain (i.e., omics, biomedical imaging, biomedical signal processing) and deep learning architecture (i.e., deep neural networks, convolutional neural networks, recurrent neural networks, emergent architectures) and present brief descriptions of each study. Additionally, we discuss theoretical and practical issues of deep learning in bioinformatics and suggest future research directions. We believe that this review will provide valuable insights and serve as a starting point for researchers to apply deep learning approaches in their bioinformatics studies.

Deep Learning in Bioinformatics

The paper "Deep Learning in Bioinformatics" by Seonwoo Min, Byunghan Lee, and Sungroh Yoon provides a comprehensive overview of the application of deep learning techniques in the field of bioinformatics. This review systematically categorizes research efforts by bioinformatics domains such as omics, biomedical imaging, and biomedical signal processing, as well as by deep learning architectures including deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and emergent architectures.

Overview of Deep Learning Architectures

The paper delineates the conceptual frameworks of different deep learning architectures and their bioinformatics applications:

Deep Neural Networks (DNNs): DNNs are noted for their strength in handling high-dimensional data, making them suitable for bioinformatics tasks such as protein structure prediction, gene expression regulation, and protein classification. Their stacked hidden layers let them learn hierarchical representations from large-scale data, which is crucial for extracting meaningful insights from complex biological datasets.
Convolutional Neural Networks (CNNs): CNNs excel at learning spatial hierarchies in data, which has been effectively utilized in biomedical imaging tasks. Their ability to capture local patterns and integrate them into higher-level features makes them particularly useful for tasks like anomaly classification and image segmentation. CNNs have also shown promise on genomic sequence data, where convolutional filters can detect motifs and regulatory patterns.
Recurrent Neural Networks (RNNs): RNNs are designed for sequential data analysis, making them well suited to time-series data such as EEG signals in biomedical signal processing. Their temporal modeling capabilities have also been applied to gene expression regulation and microRNA target prediction, with reported performance improvements over traditional methods.
Emergent Architectures: This category includes deep spatio-temporal neural networks (DST-NNs), multi-dimensional RNNs (MD-RNNs), and convolutional autoencoders (CAEs). These architectures aim to refine spatial and temporal correlations progressively and have been applied in specialized bioinformatics tasks, including protein structure prediction and cell image segmentation.
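
To make the idea of a CNN filter detecting a sequence motif concrete, here is a minimal sketch in plain Python. It is an illustration only and is not taken from the paper: the function names (`one_hot`, `conv1d_scan`, `relu`), the example sequence, and the hand-set "TATA" filter are all assumptions. In a trained CNN the filter weights would be learned from data rather than written by hand.

```python
# Illustrative sketch: one convolutional filter scanning one-hot encoded DNA.
# All names and the hand-set filter are hypothetical, not from the paper.

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot vectors."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def conv1d_scan(encoded, kernel):
    """Slide a (width x 4) filter along the sequence (no padding, stride 1)."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores

def relu(values):
    """Rectified linear activation: negative scores are clipped to zero."""
    return [max(0.0, v) for v in values]

# Hand-set filter that rewards the motif "TATA": +1 per match, -1 per mismatch.
motif = "TATA"
kernel = [[1.0 if base == b else -1.0 for b in BASES] for base in motif]

sequence = "GGCTATAAGC"
activations = relu(conv1d_scan(one_hot(sequence), kernel))

# Position with the strongest response, and the global max-pooled feature.
best_position = max(range(len(activations)), key=activations.__getitem__)
pooled = max(activations)
```

The filter responds only at the window where the motif occurs, and max pooling reduces the whole sequence to a single "motif present" feature that later layers could combine with the outputs of other filters.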

Practical and Theoretical Challenges

The paper identifies several critical challenges in applying deep learning to bioinformatics:

Limited and Imbalanced Data: The scarcity and imbalance of bioinformatics data pose significant hurdles. The paper discusses strategies like data preprocessing, cost-sensitive learning, and algorithmic modifications to mitigate these issues. Unsupervised pre-training and transfer learning are highlighted as effective techniques to leverage available data better.
Interpretability: Given the black-box nature of deep learning models, their interpretability is a major concern, especially in biomedical applications where understanding the rationale behind predictions is crucial. The paper reviews approaches such as visualization through deconvolutional networks and gradient ascent optimization to make these models more transparent.
Selection of Architecture and Hyperparameters: Choosing an appropriate deep learning architecture and tuning its hyperparameters are both critical for achieving good performance. This process has traditionally been empirical, relying on trial and error; automated hyperparameter optimization is presented as the way to streamline it.
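
As one example of the cost-sensitive learning mentioned above for imbalanced data, the sketch below weights each class inversely to its frequency and applies those weights to a cross-entropy loss. It is a minimal illustration under stated assumptions: the weighting heuristic, the function names, and the toy labels and probabilities are all chosen for this example and are not prescribed by the paper.

```python
# Illustrative sketch of cost-sensitive learning with class weights.
# The heuristic and all names are assumptions made for this example.
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of the negative log-probability of each true class."""
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum

# Toy data: 9 negatives and 1 positive (for example, a rare disease label).
labels = [0] * 9 + [1]
# A lazy classifier that always assigns probability 0.9 to the majority class.
probs = [[0.9, 0.1] for _ in labels]

weights = inverse_frequency_weights(labels)
plain_loss = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
weighted_loss = weighted_cross_entropy(probs, labels, weights)
```

With uniform weights the lazy majority-class predictor looks acceptable, while the weighted loss is several times larger because the single misclassified minority sample now counts as much as all the majority samples together.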

Future Directions

The paper outlines several promising directions for future research in deep learning for bioinformatics:

Multimodal Deep Learning: Integrating multiple data sources (e.g., genomic data, medical images, and signals) can provide a more comprehensive understanding of biological phenomena. Studies already exploring multimodal deep learning indicate its potential in enhancing prediction accuracy and uncovering novel insights.
Accelerated Learning: The paper discusses the need for advanced optimization algorithms, parallel and distributed computing, and specialized hardware to handle the computational demands of deep learning. Innovations in these areas are vital for making deep learning more accessible and efficient.
Integration of Traditional and Cutting-Edge Methods: Combining established deep learning architectures with newer components such as attention mechanisms and memory networks could lead to significant advancements. Such hybrid models could help with tasks that involve long-range dependencies or multi-step logical reasoning in bioinformatics.
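
To show why attention mechanisms help with long-range dependencies, the sketch below implements one common formulation, scaled dot-product attention, in plain Python. This is an assumption made for illustration: the paper does not specify this variant, and the function names and toy vectors are hypothetical.

```python
# Illustrative sketch of scaled dot-product attention for a single query.
# The formulation and all names are assumptions, not the paper's method.
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Return attention weights and the weighted sum of the values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    context = [sum(w * value[d] for w, value in zip(weights, values))
               for d in range(len(values[0]))]
    return weights, context

# Five sequence positions; only the last (most distant) key matches the query.
keys = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [3.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0], [10.0]]
query = [2.0, 0.0]

weights, context = attend(query, keys, values)
most_attended = max(range(len(weights)), key=weights.__getitem__)
```

Because the weights depend on content similarity rather than distance, the matching position dominates the output no matter how far away it sits in the sequence, whereas a plain recurrent model would have to carry that information through every intermediate step.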

Conclusion

"Deep Learning in Bioinformatics" serves as a valuable resource for researchers in both deep learning and bioinformatics. By providing an extensive review of current research and outlining practical challenges and future directions, the paper offers a solid foundation for future applications and research developments in the field. This review underscores the potential of deep learning in transforming bioinformatics, emphasizing the need for continued innovation and interdisciplinary collaboration.

Authors (3)

Seonwoo Min (10 papers)
Byunghan Lee (13 papers)
Sungroh Yoon (163 papers)

Citations (1,299)
