Temporal Subspace Clustering for Molecular Dynamics Data (2408.00056v1)

Published 31 Jul 2024 in cs.LG, cs.IR, and physics.chem-ph

Abstract: We introduce MOSCITO (MOlecular Dynamics Subspace Clustering with Temporal Observance), a subspace clustering method for molecular dynamics data. MOSCITO groups the timesteps of a molecular dynamics trajectory into clusters in which the molecule has similar conformations. In contrast to state-of-the-art methods, MOSCITO takes advantage of the sequential relationships found in time series data. Unlike existing work, MOSCITO does not need a two-step procedure with tedious post-processing, but directly models essential properties of the data. Interpreting clusters as Markov states allows us to evaluate the clustering performance based on the resulting Markov state models. In experiments on 60 trajectories and 4 different proteins, we show that MOSCITO achieves state-of-the-art performance with a novel single-step method. Moreover, by modeling temporal aspects, MOSCITO obtains better segmentation of trajectories, especially for small numbers of clusters.
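
The idea of interpreting clusters as Markov states can be illustrated with a minimal sketch: given one cluster label per timestep, count transitions between labels at a fixed lag time and row-normalize the counts to get a transition matrix. This is not the paper's evaluation pipeline. The function name `transition_matrix`, the toy label sequence, and the lag of 1 are assumptions chosen for illustration only.

```python
import numpy as np


def transition_matrix(labels, n_states, lag=1):
    """Row-normalized transition matrix from a discrete state sequence.

    Hypothetical helper for illustration; a naive count-based estimate,
    not the estimator used in the paper.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    labels = np.asarray(labels)
    counts = np.zeros((n_states, n_states), dtype=float)
    # Count transitions i -> j between timesteps t and t + lag.
    np.add.at(counts, (labels[:-lag], labels[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States with no outgoing transitions keep an all-zero row.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)


# Toy cluster assignment for 12 timesteps (made-up data).
labels = [0, 0, 0, 1, 1, 1, 1, 0, 0, 2, 2, 2]
T = transition_matrix(labels, n_states=3, lag=1)
print(T)
```

Each row of `T` holds the estimated probabilities of moving from one cluster to every other cluster after `lag` steps. A clustering that respects the temporal order of the trajectory tends to put most of the probability mass on the diagonal, since the molecule stays in a conformation for several consecutive timesteps.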
