Enabling full-speed random access to the entire memory on the A100 GPU (2405.11425v1)

Published 19 May 2024 in cs.PF and cs.AR

Abstract: We describe some features of the A100 memory architecture. In particular, we give a technique to reverse-engineer some hardware layout information. Using this information, we show how to avoid TLB issues to obtain full-speed random HBM access to the entire memory, as long as we constrain any particular thread to a reduced access window of less than 64GB.
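
The access pattern the abstract describes can be sketched as address arithmetic: each thread draws random addresses, but only from inside its own window. This is a minimal host-side illustration, not the paper's method. The 80 GB total, the 32 GB window size, the even spacing of window bases, and the names `window_base` and `random_addresses` are all assumptions made for the example. The only constraint taken from the abstract is that each window is smaller than 64 GB.

```python
import random

GIB = 1 << 30
TOTAL_BYTES = 80 * GIB    # assumption: an 80 GB device; the abstract gives no size
WINDOW_BYTES = 32 * GIB   # assumption: any value below the 64 GB bound in the abstract
ELEM_BYTES = 4            # assumption: 4-byte elements


def window_base(thread_id, num_threads):
    """Start address of a thread's window (hypothetical even-spacing policy).

    Bases are spread evenly from 0 to TOTAL_BYTES - WINDOW_BYTES. The windows
    jointly cover all memory once the spacing between neighbouring bases is
    no larger than WINDOW_BYTES, i.e. when there are enough threads.
    """
    if num_threads == 1:
        return 0
    span = TOTAL_BYTES - WINDOW_BYTES
    base = span * thread_id // (num_threads - 1)
    return base - base % ELEM_BYTES  # align down to an element boundary


def random_addresses(thread_id, num_threads, count, seed=0):
    """Random element-aligned addresses, all inside this thread's window."""
    rng = random.Random(seed * num_threads + thread_id)
    base = window_base(thread_id, num_threads)
    slots = WINDOW_BYTES // ELEM_BYTES
    return [base + ELEM_BYTES * rng.randrange(slots) for _ in range(count)]
```

With eight threads, for example, `random_addresses(3, 8, 1000)` returns 1000 addresses that all fall within 32 GB of `window_base(3, 8)`, while the eight windows together span the full 80 GB.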

Authors (1)
