
Integrating Deep Learning in Domain Sciences at Exascale (2011.11188v1)

Published 23 Nov 2020 in cs.LG

Abstract: This paper presents some of the current challenges in designing deep learning AI and integrating it with traditional high-performance computing (HPC) simulations. We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted at data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how those needs can be provided (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.

Authors (11)
  1. Rick Archibald
  2. Edmond Chow
  3. Eduardo D'Azevedo
  4. Jack Dongarra
  5. Markus Eisenbach
  6. Rocco Febbo
  7. Florent Lopez
  8. Daniel Nichols
  9. Stanimire Tomov
  10. Kwai Wong
  11. Junqi Yin
