Accelerating Pythonic coupled cluster implementations: a comparison between CPUs and GPUs (2310.04559v1)

Published 6 Oct 2023 in physics.chem-ph

Abstract: We scrutinize how to accelerate the bottleneck operations of Pythonic coupled cluster implementations performed on an NVIDIA Tesla V100S PCIe 32GB (rev 1a) Graphics Processing Unit (GPU). The NVIDIA Compute Unified Device Architecture (CUDA) API is interacted with via CuPy, an open-source library for Python, designed as a NumPy drop-in replacement for GPUs. The implementation uses the Cholesky linear algebra domain and is done in PyBEST, the Pythonic Black-box Electronic Structure Tool, a fully-fledged modern electronic structure software package. Due to the limitations of Video Memory (VRAM), the GPU calculations must be performed batch-wise. Timing results of some contractions containing large tensors are presented. The CuPy implementation leads to a factor of 10 speed-up compared to calculations on 36 CPUs. Furthermore, we benchmark several Pythonic routines for time and memory requirements to identify the optimal choice of the tensor contraction operations available. Finally, we compare an example CCSD and pCCD-LCCSD calculation performed solely on CPUs to their CPU-GPU hybrid implementation. Our results indicate a significant speed-up (up to a factor of 16 regarding the bottleneck operations) when offloading specific contractions to the GPU using CuPy.
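The batch-wise offloading mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from PyBEST: the function name, the array layout, and the specific contraction (a Cholesky-factorized four-index term contracted with doubles amplitudes) are assumptions chosen for illustration. The array module is passed in as `xp`, so the same code runs with NumPy on a CPU or, assuming CuPy and a CUDA device are available, with `xp=cupy`.

```python
import numpy as np


def batched_cholesky_contraction(chol, t2, batch_size, xp=np):
    """Hypothetical sketch of a batched contraction.

    Computes out[i,j,a,b] = sum_{x,c,d} chol[x,a,c] * chol[x,b,d] * t2[i,j,c,d]
    in slices over the index `a`, so that only one slice of the large
    four-index intermediate lives in (device) memory at a time.
    """
    n_virt = chol.shape[1]
    n_occ = t2.shape[0]
    out = np.zeros((n_occ, n_occ, n_virt, n_virt))

    # Move inputs to the compute device (a no-op copy when xp is NumPy).
    chol_dev = xp.asarray(chol)
    t2_dev = xp.asarray(t2)

    # cupy provides asnumpy() for device-to-host copies; NumPy does not need it.
    to_host = getattr(xp, "asnumpy", np.asarray)

    for start in range(0, n_virt, batch_size):
        stop = min(start + batch_size, n_virt)
        # Rebuild only a slice of the four-index object from Cholesky vectors.
        w = xp.einsum("xac,xbd->abcd", chol_dev[:, start:stop, :], chol_dev)
        # Contract the slice with the amplitudes and copy the result back.
        chunk = xp.einsum("abcd,ijcd->ijab", w, t2_dev)
        out[:, :, start:stop, :] = to_host(chunk)
    return out


# Small usage example with random data (sizes are arbitrary).
rng = np.random.default_rng(0)
chol = rng.standard_normal((5, 4, 4))   # (n_cholesky, n_virt, n_virt)
t2 = rng.standard_normal((2, 2, 4, 4))  # (n_occ, n_occ, n_virt, n_virt)
result = batched_cholesky_contraction(chol, t2, batch_size=3)
```

The batch size trades memory for the number of kernel launches and host-device transfers; in practice it would be chosen from the available VRAM, which this sketch leaves to the caller.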

Citations (2)
