
Potential benefits of a block-space GPU approach for discrete tetrahedral domains (1606.08881v1)

Published 28 Jun 2016 in cs.DC

Abstract: The study of data-parallel domain re-organization and thread-mapping techniques is relevant, as they can increase the efficiency of GPU computations when working on spatial discrete domains with non-box-shaped geometry. In this work we study the potential benefits of applying a succinct data re-organization of a tetrahedral data-parallel domain of size $\mathcal{O}(n^3)$ combined with an efficient block-space GPU map of the form $g:\mathbb{N} \rightarrow \mathbb{N}^3$. Results from the analysis suggest that, in theory, the combination of these two optimizations produces a significant performance improvement: block-based data re-organization allows a coalesced one-to-one correspondence at local thread-space, while $g(\lambda)$ produces an efficient block-space spatial correspondence between groups of data and groups of threads, reducing the number of unnecessary threads from $O(n^3)$ to $O(n^2\rho^3)$, where $\rho$ is the linear block-size and typically $\rho^3 \ll n$. From the analysis, we obtained that a block-based succinct data re-organization can provide up to $2\times$ improved performance over a linear data organization, while the map can be up to $6\times$ more efficient than a bounding-box approach. The results from this work can serve as a useful guide for more efficient GPU computation on tetrahedral domains found in spin lattice, finite element and special n-body problems, among others.
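To make the thread-count argument in the abstract concrete, here is a minimal Python sketch of one possible block-space map $g(\lambda)$ and of the resulting waste. It is an illustration under stated assumptions, not the authors' formulation. It assumes the tetrahedral domain is the set of cells $(x, y, z)$ with $0 \le x \le y \le z < n$, that blocks are cubes of side $\rho$, and that blocks are enumerated layer by layer. The helper names `tri`, `tet`, `g` and `wasted_threads` are hypothetical, and the cube-root estimate with integer correction stands in for whatever closed form the paper actually uses.

```python
import math


def tri(k):
    """Number of pairs (x, y) with 0 <= x <= y < k (triangular number)."""
    return k * (k + 1) // 2


def tet(k):
    """Number of triples (x, y, z) with 0 <= x <= y <= z < k (tetrahedral number)."""
    return k * (k + 1) * (k + 2) // 6


def g(lam):
    """Map a linear block index lam to block coordinates (bx, by, bz), bx <= by <= bz.

    Assumed enumeration: layers of increasing bz, then rows of increasing by.
    """
    # Floating-point estimate of the layer, then corrected with exact integer checks.
    z = int(round((6 * lam) ** (1.0 / 3.0)))
    while tet(z) > lam:
        z -= 1
    while tet(z + 1) <= lam:
        z += 1
    rem = lam - tet(z)                       # index inside the triangular layer
    y = (math.isqrt(8 * rem + 1) - 1) // 2   # largest y with tri(y) <= rem
    x = rem - tri(y)
    return (x, y, z)


def wasted_threads(n, rho):
    """Return (bounding_box_waste, block_space_waste) for domain size n, block side rho."""
    m = -(-n // rho)                         # blocks per side, ceil(n / rho)
    useful = tet(n)                          # cells that actually belong to the domain
    bounding_box = (m * rho) ** 3 - useful   # launch a full m x m x m grid of blocks
    block_space = tet(m) * rho ** 3 - useful # launch only the tet(m) blocks reached by g
    return bounding_box, block_space


if __name__ == "__main__":
    for n, rho in [(8, 2), (256, 8), (1024, 8)]:
        bb, bs = wasted_threads(n, rho)
        print(n, rho, bb, bs)
```

Under these assumptions the bounding box launches $m^3$ blocks with $m = \lceil n/\rho \rceil$, while the block-space map launches only $m(m+1)(m+2)/6$. The ratio of the two tends to 6 as $m$ grows, which is consistent with the "up to $6\times$" figure quoted in the abstract.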

Authors (3)
  1. Cristóbal A. Navarro (21 papers)
  2. Nancy Hitschfeld (15 papers)
  3. Benjamín Bustos (4 papers)
Citations (10)
