An Efficient Algorithm for Clustering of Large-Scale Mass Spectrometry Data

Published 4 Jan 2013 in cs.DS, q-bio.GN, and q-bio.QM | (1301.0834v1)

Abstract: High-throughput spectrometers are capable of producing data sets containing thousands of spectra for a single biological sample. These data sets contain a substantial amount of redundancy from peptides that may get selected multiple times in a LC-MS/MS experiment. In this paper, we present an efficient algorithm, CAMS (Clustering Algorithm for Mass Spectra) for clustering mass spectrometry data which increases both the sensitivity and confidence of spectral assignment. CAMS utilizes a novel metric, called F-set, that allows accurate identification of the spectra that are similar. A graph theoretic framework is defined that allows the use of F-set metric efficiently for accurate cluster identifications. The accuracy of the algorithm is tested on real HCD and CID data sets with varying amounts of peptides. Our experiments show that the proposed algorithm is able to cluster spectra with very high accuracy in a reasonable amount of time for large spectral data sets. Thus, the algorithm is able to decrease the computational time by compressing the data sets while increasing the throughput of the data by interpreting low S/N spectra.

Abstract PDF Upgrade to Chat

Citations (13)

View on Semantic Scholar

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

off on

Knowledge Gaps

off on

Practical Applications

off on

Glossary

off on

Conceptual Simplification

off on

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Generate Now

Continue Learning

We haven't generated follow-up questions for this paper yet.

Generate Now

An Efficient Algorithm for Clustering of Large-Scale Mass Spectrometry Data

Summary

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Authors (4)

Collections

An Efficient Algorithm for Clustering of Large-Scale Mass Spectrometry Data

Summary

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Related Papers

Authors (4)

Collections