
Direct Numerical Simulations of turbulent flows using high-order Asynchrony-Tolerant schemes: accuracy and performance (2003.10561v1)

Published 23 Mar 2020 in physics.comp-ph and physics.flu-dyn

Abstract: Direct numerical simulations (DNS) are an indispensable tool for understanding the fundamental physics of turbulent flows. Because of their steep increase in computational cost with Reynolds number ($R_{\lambda}$), well-resolved DNS are realizable only on massively parallel supercomputers, even at moderate $R_{\lambda}$. However, at extreme scales, the communications and synchronizations between processing elements (PEs) involved in current approaches become exceedingly expensive and are expected to be a major bottleneck to scalability. In order to overcome this challenge, we developed algorithms using the so-called Asynchrony-Tolerant (AT) schemes that relax communication and synchronization constraints at a mathematical level, to perform DNS of decaying and solenoidally forced compressible turbulence. Asynchrony is introduced using two approaches, one that avoids synchronizations and the other that avoids communications. These result in periodic and random delays, respectively, at PE boundaries. We show that both asynchronous algorithms accurately resolve the large-scale and small-scale motions of turbulence, including instantaneous and intermittent fields. We also show that in asynchronous simulations the communication time is a relatively smaller fraction of the total computation time, especially at large processor count, compared to standard synchronous simulations. As a consequence, we observe improved parallel scalability up to $262144$ processors for both asynchronous algorithms.
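To make "delays at PE boundaries" concrete, the following is a minimal sketch and not the paper's method. It assumes a toy problem (the 1D periodic heat equation with a standard second-order central difference) instead of compressible turbulence, and it does not implement AT schemes. Each simulated PE refreshes its halo values only every `refresh_every` steps, which produces a periodic delay. The function name, parameters, and grid sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_heat_with_stale_halos(n=64, n_pe=4, steps=200, refresh_every=1, alpha=1.0):
    """Explicit 1D heat equation on a periodic grid split into n_pe blocks.

    Halo (neighbour) values are refreshed only every `refresh_every` steps,
    so in between refreshes each block computes with stale boundary data.
    Returns the max-norm error against the exact solution exp(-alpha*t)*sin(x).
    Assumption: this uses a standard (non-AT) central-difference stencil.
    """
    dx = 2.0 * np.pi / n
    dt = 0.2 * dx**2 / alpha          # r = alpha*dt/dx^2 = 0.2, within the stable range
    r = alpha * dt / dx**2
    x = np.arange(n) * dx
    u0 = np.sin(x)

    m = n // n_pe                      # points per simulated PE (n must divide evenly)
    blocks = [u0[p * m:(p + 1) * m].copy() for p in range(n_pe)]
    halos = [None] * n_pe

    for step in range(steps):
        # "Communication": copy neighbour edge values, but only on refresh steps.
        if step % refresh_every == 0:
            halos = [
                (blocks[(p - 1) % n_pe][-1], blocks[(p + 1) % n_pe][0])
                for p in range(n_pe)
            ]
        new_blocks = []
        for p in range(n_pe):
            left, right = halos[p]     # possibly stale values
            padded = np.concatenate(([left], blocks[p], [right]))
            lap = padded[2:] - 2.0 * padded[1:-1] + padded[:-2]
            new_blocks.append(blocks[p] + r * lap)
        blocks = new_blocks

    u_num = np.concatenate(blocks)
    u_exact = np.exp(-alpha * dt * steps) * np.sin(x)
    return float(np.max(np.abs(u_num - u_exact)))


if __name__ == "__main__":
    print("synchronous error:", simulate_heat_with_stale_halos(refresh_every=1))
    print("delayed error:    ", simulate_heat_with_stale_halos(refresh_every=4))
```

In this toy setting, feeding stale halo values into an unmodified stencil degrades accuracy relative to the synchronous run. That loss is the problem AT schemes are designed to address, according to the abstract, by relaxing the constraints at the mathematical level.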
