Predicting protein folding dynamics using sequence information (2505.17237v1)
Abstract: Natural protein sequences somehow encode the structural forms that these molecules adopt. Recent developments in structure prediction are agnostic to the mechanisms by which proteins fold, and they represent proteins as static objects. However, amino acid sequences also encode information about how the folding process can happen, and about how variations in the sequences affect the populations of the distinct structural forms that proteins acquire. Here we present a method to infer protein folding dynamics based only on sequence information. We first obtain a precise 'evolutionary field' from the observed variations in the sequences of homologous proteins. We then show how to map the resulting energetics onto a coarse-grained folding model in which the protein is treated as a string of interacting foldons. We then describe how, for any given protein sequence in a family, the equilibrium folding curve can be computed and how the emergence of protein folding sub-domains can be identified. Finally, we present protocols to analyze how mutations perturb both folding stability and cooperativity, which yield predictions for a deep mutational scan of a protein of interest.
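To make the idea of an equilibrium folding curve for a string of interacting foldons more concrete, the following is a minimal toy sketch, not the model used in the paper. It assumes a one-dimensional Ising-like chain in which each foldon is either unfolded (0) or folded (1), with a hypothetical per-foldon folding energy, a hypothetical nearest-neighbour coupling, and a hypothetical entropic cost of folding. All names (`folding_curve`, `energies`, `couplings`, `entropy_cost`) and all numerical values are illustrative assumptions; in the approach described in the abstract, the energetics would instead be derived from the evolutionary field.

```python
import itertools
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)


def folding_curve(energies, couplings, entropy_cost, temperatures):
    """Mean fraction of folded foldons at each temperature.

    Toy model (an assumption, not the paper's model):
      - energies[i]: energy gained when foldon i folds (negative = stabilizing)
      - couplings[i]: interaction energy when foldons i and i+1 are both folded
      - entropy_cost: entropy lost per folded foldon, in kcal/(mol*K)
    The partition function is computed by exact enumeration of all 2**n
    states, so this is only practical for a small number of foldons.
    """
    n = len(energies)
    curve = []
    for temp in temperatures:
        partition = 0.0
        folded_weighted = 0.0
        for state in itertools.product((0, 1), repeat=n):
            # Free energy of this microstate: folding energy plus the
            # entropic penalty for each folded foldon.
            g = sum(s * (e + temp * entropy_cost)
                    for s, e in zip(state, energies))
            # Nearest-neighbour coupling between adjacent folded foldons.
            g += sum(couplings[i] * state[i] * state[i + 1]
                     for i in range(n - 1))
            weight = math.exp(-g / (K_B * temp))
            partition += weight
            folded_weighted += weight * sum(state) / n
        curve.append(folded_weighted / partition)
    return curve


# Hypothetical parameters for a chain of four foldons.
temperatures = list(range(200, 401, 20))  # kelvin
curve = folding_curve(
    energies=[-4.0, -4.0, -4.0, -4.0],   # kcal/mol
    couplings=[-2.0, -2.0, -2.0],        # kcal/mol
    entropy_cost=0.02,                   # kcal/(mol*K)
    temperatures=temperatures,
)
for temp, fraction in zip(temperatures, curve):
    print(f"T = {temp} K  fraction folded = {fraction:.3f}")
```

In this toy setting, stronger couplings between neighbouring foldons sharpen the transition, which gives one simple way to think about cooperativity. Changing a single entry of `energies` plays the role of a point mutation that perturbs the stability of one foldon.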