Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
GPT-4o
Gemini 2.5 Pro Pro
o3 Pro
GPT-4.1 Pro
DeepSeek R1 via Azure Pro
2000 character limit reached

Porting DDalphaAMG solver to K computer (1811.00355v2)

Published 1 Nov 2018 in hep-lat and physics.comp-ph

Abstract: We port Domain-Decomposed-alpha-AMG solver to the K computer. The system has 8 cores and 16 GB memory per node, of which theoretical peak is 128 GFlops (82,944 nodes in total). Its feature, as many as 256 registers per core and as large as 0.5 byte/Flop ratio, requires a different tuning from other machines. In order to use more registers, we change some of the data structure and rewrite matrix-vector operations with intrinsics. The performance is improved by more than a factor two for twelve solves including the setup. The efficiency is still about 5% after the optimization, which is lower than a previously tuned mixed precision solver for the K computer, 22%. The throughput is, however, more than two times better for a physical point configuration.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.