Adaptive Ensemble Biomolecular Simulations at Scale (1804.04736v5)
Abstract: Recent advances in both theory and methods have created opportunities to simulate biomolecular processes more efficiently using adaptive ensemble simulations. Ensemble-based simulations are used widely to compute a number of individual simulation trajectories and analyze statistics across them. Adaptive ensemble simulations offer a further level of sophistication and flexibility by enabling high-level algorithms to control simulations based on intermediate results. Novel high-level algorithms require sophisticated approaches to utilize the intermediate data during runtime. Thus, there is a need for scalable software systems to support adaptive ensemble-based applications. We describe the operations in executing adaptive workflows, classify different types of adaptations, and describe challenges in implementing them in software tools. We enhance Ensemble Toolkit (EnTK) -- an ensemble execution system -- to support the scalable execution of adaptive workflows on HPC systems, and characterize the adaptation overhead in EnTK. We implement two high-level adaptive ensemble algorithms -- expanded ensemble and Markov state modeling -- and execute up to $2^{12}$ ensemble members on thousands of cores on three distinct HPC platforms. We highlight scientific advantages enabled by the novel capabilities of our approach. To the best of our knowledge, this is the first attempt at describing and implementing multiple adaptive ensemble workflows using a common conceptual and implementation framework.