
Transformer-based de novo peptide sequencing for data-independent acquisition mass spectrometry (2402.11363v3)

Published 17 Feb 2024 in q-bio.QM and cs.AI

Abstract: Tandem mass spectrometry (MS/MS) is the predominant high-throughput technique for comprehensively analyzing the protein content of biological samples and a cornerstone of proteomics. In recent years, substantial progress has been made in data-independent acquisition (DIA) strategies, which enable unbiased, non-targeted fragmentation of precursor ions. DIA-generated MS/MS spectra are difficult to interpret because they are highly multiplexed: each spectrum contains fragment ions originating from multiple precursor peptides. This poses a particularly acute challenge for de novo peptide/protein sequencing, where current methods are ill-equipped to handle multiplexing. In this paper, we introduce DiaTrans, a deep-learning model based on the transformer architecture that deciphers peptide sequences from DIA mass spectrometry data. Our results show significant improvements over existing state-of-the-art (SOTA) methods, including DeepNovo-DIA and PepNet. Casanovo-DIA improves precision by 15.14% to 34.8% and recall by 11.62% to 31.94% at the amino acid level, and improves precision by 59% to 81.36% at the peptide level. Integrating DIA data with our DiaTrans model holds considerable promise for uncovering novel peptides and profiling biological samples more comprehensively. Casanovo-DIA is freely available under the GNU GPL license at https://github.com/Biocomputing-Research-Group/DiaTrans.

