Towards a Benchmark for Markov State Models: The Folding of HP35 (2306.04331v2)
Abstract: Adopting a $300\,\mu$s-long molecular dynamics (MD) trajectory of the reversible folding of villin headpiece (HP35) published by D. E. Shaw Research, we recently constructed a Markov state model (MSM) of the folding process based on interresidue contacts [J. Chem. Theory Comput. 2023, $\mathbf{19}$, 3391]. The model reproduces the MD folding times of the system and predicts that both the native basin and the unfolded region of the free energy landscape are partitioned into several metastable substates that are structurally well characterized. Recognizing the need to establish well-defined but nontrivial benchmark problems, in this Perspective we study to what extent and in what sense this MSM may be employed as a reference model. To this end, we test the robustness of the MSM by comparing it to models that use alternative combinations of features, dimensionality reduction methods and clustering schemes. The study suggests some main characteristics of the folding of HP35, which should be reproduced by any other competitive model of the system. Moreover, the discussion reveals which parts of the MSM workflow matter most for the considered problem, and illustrates the promises and possible pitfalls of state-based models for the interpretation of biomolecular simulations.
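As a rough illustration of the last stage of an MSM workflow, the sketch below estimates a transition matrix from a discrete state trajectory and derives implied timescales from its eigenvalues. It is not the authors' pipeline: the feature selection, dimensionality reduction and clustering steps are skipped, the two-state trajectory is synthetic toy data rather than HP35, and the names `estimate_msm` and `implied_timescales` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def estimate_msm(dtraj, n_states, lag):
    """Row-normalized transition matrix from a discrete state trajectory.

    Hypothetical helper: counts transitions with a sliding window and
    normalizes rows, without enforcing detailed balance.
    """
    counts = np.zeros((n_states, n_states))
    # Count transitions i -> j between frames separated by `lag` steps.
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return counts / row_sums


def implied_timescales(T, lag):
    """t_k = -lag / ln|lambda_k| for the non-stationary eigenvalues of T."""
    eigvals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(eigvals[1:])


# Synthetic two-state trajectory (toy data, assumed for illustration only).
rng = np.random.default_rng(0)
T_true = np.array([[0.95, 0.05],
                   [0.10, 0.90]])
n_frames = 20000
u = rng.random(n_frames)
dtraj = np.zeros(n_frames, dtype=int)
for t in range(1, n_frames):
    # Move to state 0 with probability T_true[previous state, 0], else state 1.
    dtraj[t] = 0 if u[t] < T_true[dtraj[t - 1], 0] else 1

T_est = estimate_msm(dtraj, n_states=2, lag=1)
its = implied_timescales(T_est, lag=1)
print(T_est)
print(its)
```

In a robustness comparison of the kind described above, one would typically repeat such an estimate for several lag times and for state definitions obtained from different features or clustering schemes, then check whether the slowest implied timescales agree.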