Transformer-based de novo peptide sequencing for data-independent acquisition mass spectrometry (2402.11363v3)

Published 17 Feb 2024 in q-bio.QM and cs.AI

Abstract: Tandem mass spectrometry (MS/MS) is the predominant high-throughput technique for comprehensively analyzing the protein content of biological samples and a cornerstone of proteomics. In recent years, substantial progress has been made in data-independent acquisition (DIA) strategies, which enable unbiased, non-targeted fragmentation of precursor ions. DIA-generated MS/MS spectra are difficult to interpret because they are highly multiplexed: each spectrum contains fragment ions originating from multiple precursor peptides. This poses a particularly acute challenge for de novo peptide/protein sequencing, where current methods are ill-equipped to handle multiplexed spectra. In this paper, we introduce DiaTrans, a deep-learning model based on the transformer architecture that deciphers peptide sequences from DIA mass spectrometry data. Our results show significant improvements over existing state-of-the-art (SOTA) methods, including DeepNovo-DIA and PepNet. Casanovo-DIA improves precision by 15.14% to 34.8% and recall by 11.62% to 31.94% at the amino acid level, and improves precision by 59% to 81.36% at the peptide level. Integrating DIA data with our DiaTrans model holds considerable promise for uncovering novel peptides and profiling biological samples more comprehensively. Casanovo-DIA is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/DiaTrans.
