
Code modernization strategies for short-range non-bonded molecular dynamics simulations (2109.10876v3)

Published 22 Sep 2021 in cs.DC and physics.comp-ph

Abstract: Modern HPC systems increasingly rely on greater core counts and wider vector registers, so applications must be adapted to fully utilize these hardware capabilities. One class of applications that can benefit from this increase in parallelism is molecular dynamics simulations. In this paper, we describe our efforts to modernize the ESPResSo++ molecular dynamics simulation package by restructuring its particle data layout for efficient memory access and applying vectorization techniques to the calculation of short-range non-bonded forces, which yields an overall threefold speedup and serves as a baseline for further optimizations. We also implement fine-grained parallelism for multi-core CPUs through HPX, a C++ runtime system that uses lightweight threads and an asynchronous many-task approach to maximize concurrency. Our goal is to evaluate the performance of an HPX-based approach against the bulk-synchronous MPI-based implementation. This requires introducing an additional layer in the domain decomposition scheme that defines the task granularity. On spatially inhomogeneous systems, which impose a corresponding load imbalance on traditional MPI-based approaches, we demonstrate that, by choosing an optimal task size, the efficient work-stealing mechanisms of HPX can overcome the communication overhead, resulting in an overall 1.4-fold speedup over the baseline MPI version.
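The data-layout and vectorization points in the abstract can be made concrete with a small sketch. The following is a minimal illustration, not ESPResSo++'s actual code: the `ParticleSoA` type, the `lj_forces` kernel, and the assumption of a cutoff-filtered (Verlet) neighbour list are all hypothetical, and the Newton's-third-law write-back is omitted to keep the inner loop free of scatter stores.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays (SoA) particle storage. Keeping each
// coordinate in its own contiguous array gives the compiler unit-stride
// vector loads, which an array-of-structs layout would prevent.
struct ParticleSoA {
    std::vector<double> x, y, z;    // positions
    std::vector<double> fx, fy, fz; // accumulated forces
};

// Lennard-Jones force on particle i from a neighbour list `nbr` that is
// assumed to be pre-filtered by the cutoff, so the loop body is
// branch-free and auto-vectorizes (here hinted with `#pragma omp simd`).
// The write-back to each neighbour j is omitted: it would introduce
// scatter stores that defeat vectorization in this simple form.
void lj_forces(ParticleSoA& p, std::size_t i,
               const std::size_t* nbr, std::size_t n,
               double eps, double sigma2) {
    double fxi = 0.0, fyi = 0.0, fzi = 0.0;
    #pragma omp simd reduction(+:fxi, fyi, fzi)
    for (std::size_t k = 0; k < n; ++k) {
        const std::size_t j = nbr[k];
        const double dx = p.x[i] - p.x[j];
        const double dy = p.y[i] - p.y[j];
        const double dz = p.z[i] - p.z[j];
        const double r2 = dx * dx + dy * dy + dz * dz;
        const double s6 = (sigma2 / r2) * (sigma2 / r2) * (sigma2 / r2);
        // F/r = 24*eps*(2*(sigma/r)^12 - (sigma/r)^6) / r^2
        const double ff = 24.0 * eps * s6 * (2.0 * s6 - 1.0) / r2;
        fxi += ff * dx;
        fyi += ff * dy;
        fzi += ff * dz;
    }
    p.fx[i] += fxi;
    p.fy[i] += fyi;
    p.fz[i] += fzi;
}
```

Filtering neighbours by the cutoff at list-build time trades a little redundant list maintenance for a branch-free, vectorizable force loop; this is one common way to realize the kind of speedup the abstract reports, not necessarily the paper's exact approach.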
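The HPX task layer described in the abstract can likewise be sketched. This is only an illustration of the asynchronous many-task pattern under stated assumptions: `CellBlock`, `compute_block_forces`, and the block count are hypothetical stand-ins for the paper's extra decomposition layer; `hpx::async`, `hpx::future`, and `hpx::wait_all` are real HPX facilities.

```cpp
#include <hpx/hpx_main.hpp>      // runs main() inside the HPX runtime
#include <hpx/include/async.hpp> // hpx::async
#include <hpx/include/lcos.hpp>  // hpx::wait_all
#include <functional>
#include <vector>

// Hypothetical: one block of cells from the additional decomposition
// layer that defines the task granularity within an MPI rank's domain.
struct CellBlock {
    // particle indices, neighbour-cell lists, ...
};

// Hypothetical per-block short-range kernel; in practice this would call
// a vectorized pair kernel like the one sketched above.
void compute_block_forces(CellBlock& blk) { (void)blk; }

// Spawn one lightweight HPX task per block and wait for all of them.
// Creating many more blocks than cores lets the work-stealing scheduler
// rebalance spatially inhomogeneous particle densities.
void force_step(std::vector<CellBlock>& blocks) {
    std::vector<hpx::future<void>> tasks;
    tasks.reserve(blocks.size());
    for (auto& blk : blocks)
        tasks.push_back(hpx::async(compute_block_forces, std::ref(blk)));
    hpx::wait_all(tasks); // implicit barrier before the integration step
}

int main() {
    std::vector<CellBlock> blocks(256); // e.g. ~8 tasks per core on 32 cores
    force_step(blocks);
    return 0;
}
```

The design tension the paper evaluates is visible here: finer blocks give the scheduler more freedom to steal work from overloaded cores, but each task adds scheduling and communication overhead, hence the search for an optimal task size.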
