
Toward Scalable Machine Learning and Data Mining: the Bioinformatics Case (1710.00112v1)

Published 29 Sep 2017 in cs.DC, cs.LG, and stat.ML

Abstract: In an effort to overcome the data deluge in computational biology and bioinformatics and to facilitate bioinformatics research in the era of big data, we identify some of the most influential algorithms that have been widely used in the bioinformatics community. These top data mining and machine learning algorithms cover classification, clustering, regression, graphical model-based learning, and dimensionality reduction. The goal of this study is to guide the focus of scalable computing experts in the endeavor of applying new storage and scalable computation designs to bioinformatics algorithms that merit their attention most, following the engineering maxim of "optimize the common case".

Authors (6)
  1. Faraz Faghri (6 papers)
  2. Sayed Hadi Hashemi (8 papers)
  3. Mohammad Babaeizadeh (16 papers)
  4. Mike A. Nalls (5 papers)
  5. Saurabh Sinha (25 papers)
  6. Roy H. Campbell (7 papers)
Citations (4)