Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
153 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Performance report and optimized implementation of Weather & Climate Dwarfs on GPU, MIC and Optalysys Optical Processor (1908.06096v1)

Published 16 Aug 2019 in cs.DC

Abstract: This document is one of the deliverable reports created for the ESCAPE project. ESCAPE stands for Energy-efficient Scalable Algorithms for Weather Prediction at Exascale. The project develops world-class, extreme-scale computing capabilities for European operational numerical weather prediction and future climate models. This is done by identifying Weather & Climate dwarfs which are key patterns in terms of computation and communication (in the spirit of the Berkeley dwarfs). These dwarfs are then optimised for different hardware architectures (single and multi-node) and alternative algorithms are explored. Performance portability is addressed through the use of domain specific languages. Here we summarize the work performed on optimizations of the dwarfs on CPUs, Xeon Phi, GPUs and on the Optalysys optical processor. We limit ourselves to a subset of the dwarf configurations and to problem sizes small enough to execute on a single node. Also, we use time-to-solution as the main performance metric. Multi-node optimizations of the dwarfs and energy-specific optimizations are beyond the scope of this report and will be described in Deliverable D3.4. To cover the important algorithmic motifs we picked dwarfs related to the dynamical core as well as column physics. Specifically, we focused on the formulation relevant to spectral codes like ECMWF's IFS code. The main findings of this report are: (a) Acceleration of 1.1x - 2.5x of the dwarfs on CPU based systems using compiler directives, (b) order of magnitude acceleration of the dwarfs on GPUs (23x for spectral transform, 9x for MPDATA) using data locality optimizations and (c) demonstrated feasibility of a spectral transform in a purely optical fashion.

Citations (1)

Summary

We haven't generated a summary for this paper yet.