Optimizing the Gravitational Tree Algorithm for Many-Core Processors (2312.06102v2)

Published 11 Dec 2023 in astro-ph.IM, astro-ph.CO, and astro-ph.GA

Abstract: Gravitational $N$-body simulations calculate numerous interactions between particles. The tree algorithm reduces these calculations by constructing a hierarchical oct-tree structure and approximating the gravitational forces on particles. Over the last three decades, the tree algorithm has been used extensively in large-scale simulations, and its parallelization in distributed memory environments has been well studied. However, recent supercomputers are equipped with many CPU cores per node, so optimizing the tree construction in shared memory environments is becoming crucial. We propose a novel tree construction method that contrasts with the conventional top-down approach: it first creates all leaf cells without traversing the tree and then constructs the remaining cells bottom-up. We evaluated the performance of our method on the supercomputer Fugaku and on an Intel machine. On a single thread, our method accelerates one of the most time-consuming steps of the conventional tree construction by a factor of more than 3.0 on Fugaku and 2.2 on the Intel machine. Furthermore, the parallel tree construction time decreases considerably as the number of threads increases. Compared to the conventional sequential tree construction method, we achieve a speedup of over 45 on 48 threads of Fugaku and of more than 56 on 112 threads of the Intel machine. In stark contrast to the conventional method, tree construction with our method no longer constitutes a bottleneck in the tree algorithm, even when many threads are used.
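To make the core idea concrete, here is a minimal, serial Python sketch of a bottom-up oct-tree construction of the kind the abstract describes: particles are sorted by a Morton (space-filling) key, all leaf cells are created in a single pass over the sorted keys with no top-down traversal, and coarser cells are then assembled level by level from their children. This is an illustration under simplifying assumptions, not the authors' implementation; the fixed refinement level MAX_LEVEL, the morton_key encoding, and the build_tree_bottom_up helper are all introduced here for the example.

```python
import numpy as np

MAX_LEVEL = 4  # assumed finest refinement level, chosen only for this illustration


def morton_key(ix, iy, iz, level):
    """Interleave the bits of a 3-D integer cell coordinate into a Morton key."""
    key = 0
    for b in range(level):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def build_tree_bottom_up(pos, box_size=1.0):
    """Return (particle ordering, cells), where cells maps (level, key) to a
    (first, count) range of the key-sorted particles."""
    # 1. Map positions in [0, box_size)^3 to grid coordinates at the finest level.
    n1d = 1 << MAX_LEVEL
    grid = np.minimum((pos / box_size * n1d).astype(int), n1d - 1)
    keys = np.array([morton_key(ix, iy, iz, MAX_LEVEL) for ix, iy, iz in grid])

    # 2. Sort particles by key and create all leaf cells in one sweep,
    #    without any top-down traversal of the tree.
    order = np.argsort(keys)
    sorted_keys = keys[order]
    leaf_keys, first, counts = np.unique(sorted_keys, return_index=True,
                                         return_counts=True)
    cells = {(MAX_LEVEL, int(k)): (int(f), int(c))
             for k, f, c in zip(leaf_keys, first, counts)}

    # 3. Build the remaining cells bottom-up: a parent's key is its child's key
    #    with the lowest 3 bits dropped, and it spans its children's particles.
    level_keys = leaf_keys
    for level in range(MAX_LEVEL - 1, -1, -1):
        parent_keys = np.unique(level_keys >> 3)
        for pk in parent_keys:
            children = [(level + 1, int(ck)) for ck in level_keys
                        if (ck >> 3) == pk]
            start = min(cells[c][0] for c in children)
            count = sum(cells[c][1] for c in children)
            cells[(level, int(pk))] = (start, count)
        level_keys = parent_keys
    return order, cells


# Example: 1000 random particles in a unit box; the root cell covers all of them.
rng = np.random.default_rng(0)
order, cells = build_tree_bottom_up(rng.random((1000, 3)))
assert cells[(0, 0)] == (0, 1000)
```

In a production tree code, each cell would additionally store multipole moments for the force approximation, and both the leaf-creation sweep and the per-level merging would be parallelized over threads, which is the regime where the paper's reported multi-thread speedups apply.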
