
MGPU-TSM: A Multi-GPU System with Truly Shared Memory (2008.02300v2)

Published 5 Aug 2020 in cs.AR

Abstract: The sizes of GPU applications are rapidly growing. They are exhausting the compute and memory resources of a single GPU, and are demanding the move to multiple GPUs. However, the performance of these applications scales sub-linearly with GPU count because of the overhead of data movement across multiple GPUs. Moreover, a lack of hardware support for coherency exacerbates the problem because a programmer must either replicate the data across GPUs or fetch the remote data using high-overhead off-chip links. To address these problems, we propose a multi-GPU system with truly shared memory (MGPU-TSM), where the main memory is physically shared across all the GPUs. We eliminate remote accesses and avoid data replication using an MGPU-TSM system, which simplifies the memory hierarchy. Our preliminary analysis shows that MGPU-TSM with 4 GPUs performs, on average, 3.9x better than the current best performing multi-GPU configuration for standard application benchmarks.
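The trade-off the abstract describes — replicate data on every GPU, fetch it over off-chip links, or share one physical memory — can be illustrated with a toy back-of-the-envelope model. This sketch is illustrative only; the formulas and numbers are assumptions for intuition, not the paper's methodology or results.

```python
# Toy model of aggregate memory footprint and the fraction of accesses
# that must cross an inter-GPU link, for three data-sharing strategies.
# All formulas are simplifying assumptions, not taken from the paper.

def replication(data_gb, num_gpus):
    """Every GPU holds a full copy: no remote reads, N-fold memory use."""
    return {"total_mem_gb": data_gb * num_gpus, "remote_fraction": 0.0}

def remote_access(data_gb, num_gpus):
    """Data partitioned across GPUs; assuming uniform access, a
    (N-1)/N share of accesses goes over high-overhead off-chip links."""
    return {"total_mem_gb": data_gb,
            "remote_fraction": (num_gpus - 1) / num_gpus}

def truly_shared_memory(data_gb, num_gpus):
    """MGPU-TSM idea: one physically shared main memory, so there are
    no replicas and no remote (inter-GPU link) accesses."""
    return {"total_mem_gb": data_gb, "remote_fraction": 0.0}

for scheme in (replication, remote_access, truly_shared_memory):
    print(scheme.__name__, scheme(16, 4))
```

With 16 GB of application data on 4 GPUs, replication quadruples the memory footprint while partitioning pushes roughly three quarters of accesses off-chip; a physically shared memory avoids both costs, which is the motivation the abstract gives for MGPU-TSM.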

Authors (9)
  1. Saiful A. Mojumder (3 papers)
  2. Yifan Sun (183 papers)
  3. Leila Delshadtehrani (4 papers)
  4. Yenai Ma (3 papers)
  5. Trinayan Baruah (3 papers)
  6. José L. Abellán (10 papers)
  7. John Kim (23 papers)
  8. David Kaeli (25 papers)
  9. Ajay Joshi (25 papers)
Citations (5)
