A Deep Autoencoder Framework for Discovery of Metastable Ensembles in Biomacromolecules

Published 1 Jun 2021 in physics.chem-ph and physics.bio-ph | (2106.00724v1)

Abstract: Mini-proteins and peptides exhibit dynamic conformational fluctuations and interconvert among metastable states. A robust mapping of the conformational landscape underlying mini-proteins and peptides often requires a low-dimensional projection of the conformational ensemble along optimized collective variables. However, the traditional choice of collective variable (CV) is often limited by user intuition and prior knowledge about the system, and it lacks a rigorous assessment of its optimality over other candidate CVs. To address this issue, we propose a generic approach in which we first choose the possible combinations of inter-residue Cα distances within a given macromolecule as a set of input CVs. Subsequently, we derive a non-linear combination of latent-space-embedded collective variables by auto-encoding the unbiased MD simulation trajectories within the framework of a feed-forward neural network. We demonstrate the ability of the derived latent-space variables to elucidate the conformational landscape in three hierarchically complex systems. Resolving the conformational dynamics along the latent-space CVs identifies key metastable states of a bead-in-a-spring polymer. Combining the adopted dimensionality reduction technique with a Markov state model built on the derived latent space efficiently projects the free energy landscape of the GB1 beta-hairpin, revealing multiple spatially well-resolved and kinetically well-separated metastable conformations. A quantitative comparison, based on the variational approach to Markov processes (VAMP), of the autoencoder-derived latent-space CVs with those obtained from PCA or TICA confirms the optimality of the former. Finally, as a practical application, we demonstrate that the autoencoder-derived CVs successfully predict the reinforced folding of the Trp-cage mini-protein in an aqueous osmolyte solution.
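
The workflow described in the abstract (pairwise Cα distances as input features, followed by a feed-forward autoencoder whose bottleneck supplies the latent-space CVs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy two-state "trajectory", the layer sizes, the tanh activations, the learning rate, and the function names `pairwise_ca_distances` and `train_autoencoder` are all assumptions chosen only to keep the example small and self-contained.

```python
import numpy as np

def pairwise_ca_distances(coords):
    """Map coordinates (n_frames, n_residues, 3) to all unique pair distances."""
    i, j = np.triu_indices(coords.shape[1], k=1)
    return np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=-1)

def train_autoencoder(x, latent_dim=2, hidden=16, lr=0.1, epochs=300, seed=0):
    """Full-batch gradient descent on a d -> hidden -> latent -> hidden -> d network.

    Hidden layers use tanh; the latent and output layers are linear.
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    sizes = [x.shape[1], hidden, latent_dim, hidden, x.shape[1]]
    W = [rng.normal(0.0, np.sqrt(1.0 / a), (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    b = [np.zeros(s) for s in sizes[1:]]

    def forward(inp):
        acts = [inp]
        for k in range(4):
            z = acts[-1] @ W[k] + b[k]
            acts.append(np.tanh(z) if k in (0, 2) else z)
        return acts

    losses = []
    for _ in range(epochs):
        acts = forward(x)
        err = acts[-1] - x
        losses.append(float(np.mean(err ** 2)))
        grad = 2.0 * err / err.size              # dLoss/dOutput
        for k in reversed(range(4)):
            if k in (0, 2):                      # back through tanh
                grad = grad * (1.0 - acts[k + 1] ** 2)
            gW, gb = acts[k].T @ grad, grad.sum(axis=0)
            grad = grad @ W[k].T                 # propagate before updating W[k]
            W[k] -= lr * gW
            b[k] -= lr * gb
    latent = forward(x)[2]                       # bottleneck activations = latent CVs
    return latent, losses

# Toy stand-in for an MD trajectory: a 5-bead chain switching between an
# extended and a compact state (hypothetical data, not from the paper).
rng = np.random.default_rng(1)
n_frames, n_res = 200, 5
spacing = np.where(rng.random(n_frames) < 0.5, 1.0, 0.5)
coords = np.zeros((n_frames, n_res, 3))
coords[:, :, 0] = spacing[:, None] * np.arange(n_res)
coords += rng.normal(0.0, 0.05, coords.shape)

feats = pairwise_ca_distances(coords)
feats = (feats - feats.mean(axis=0)) / feats.std(axis=0)   # standardize inputs
latent, losses = train_autoencoder(feats)
```

In a real application the coordinates would come from an unbiased MD trajectory, and the resulting `latent` array would be the input to clustering and Markov state model construction, or to a VAMP-based comparison against PCA and TICA projections.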