
RoseNet: Predicting Energy Metrics of Double InDel Mutants Using Deep Learning

Published 20 Oct 2023 in q-bio.BM, cs.AI, cs.LG, and q-bio.QM | (2310.13806v1)

Abstract: An amino acid insertion or deletion, or InDel, can have profound and varying functional impacts on a protein's structure. InDel mutations in the cystic fibrosis transmembrane conductance regulator protein, for example, give rise to cystic fibrosis. Unfortunately, performing InDel mutations on physical proteins and studying their effects is a time-prohibitive process. Consequently, modeling InDels computationally can supplement and inform wet lab experiments. In this work, we make use of our data sets of exhaustive double InDel mutations for three proteins, which we computationally generated using a robotics-inspired inverse kinematics approach available in Rosetta. We develop and train a neural network, RoseNet, on several structural and energetic metrics output by Rosetta during the mutant generation process. We explore and present how RoseNet is able to emulate the exhaustive data set using deep learning methods, and show to what extent it can predict Rosetta metrics for unseen mutant sequences with two InDels. RoseNet achieves a median Pearson correlation coefficient of 0.775 over all Rosetta scores for the largest protein. Furthermore, a sensitivity analysis is performed to determine the quantity of data required to accurately emulate the structural scores for computationally generated mutants. We show that the model can be trained on less than 50% of the data and still retain a high level of accuracy.
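The headline figure in the abstract is a median of Pearson correlation coefficients taken over several Rosetta scores. The following is a minimal sketch of how such a summary could be computed, not the authors' evaluation code. The function names, the metric names (`total_score`, `fa_atr`, `fa_rep`), and the numbers are illustrative assumptions and do not come from the paper.

```python
import math
from statistics import median


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def median_pcc(true_by_metric, pred_by_metric):
    """Median of the per-metric Pearson coefficients.

    Both arguments map a metric name to a list of values, one per mutant,
    with mutants listed in the same order in both mappings.
    """
    return median(
        pearson(true_by_metric[m], pred_by_metric[m]) for m in true_by_metric
    )


# Hypothetical values for four mutants and three illustrative metrics.
rosetta_scores = {
    "total_score": [-310.2, -295.7, -301.4, -288.9],
    "fa_atr": [-812.5, -790.1, -801.3, -779.8],
    "fa_rep": [98.4, 110.2, 104.7, 121.6],
}
predicted_scores = {
    "total_score": [-308.0, -297.1, -299.9, -291.5],
    "fa_atr": [-809.9, -793.4, -798.0, -784.2],
    "fa_rep": [101.0, 108.3, 107.9, 117.5],
}

print(median_pcc(rosetta_scores, predicted_scores))
```

Taking the median rather than the mean keeps a single poorly predicted metric from dominating the summary value.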

