Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
93 tokens/sec
Gemini 2.5 Pro Premium
47 tokens/sec
GPT-5 Medium
32 tokens/sec
GPT-5 High Premium
29 tokens/sec
GPT-4o
87 tokens/sec
DeepSeek R1 via Azure Premium
93 tokens/sec
GPT OSS 120B via Groq Premium
483 tokens/sec
Kimi K2 via Groq Premium
203 tokens/sec
2000 character limit reached

Comparing CPU and GPU compute of PERMANOVA on MI300A (2505.04556v1)

Published 7 May 2025 in cs.DC, cs.PF, and q-bio.QM

Abstract: Comparing the tradeoffs of CPU and GPU compute for memory-heavy algorithms is often challenging, due to the drastically different memory subsystems on host CPUs and discrete GPUs. The AMD MI300A is an exception, since it sports both CPU and GPU cores in a single package, all backed by the same type of HBM memory. In this paper we analyze the performance of Permutational Multivariate Analysis of Variance (PERMANOVA), a non-parametric method that tests whether two or more groups of objects are significantly different based on a categorical factor. This method is memory-bound and has been recently optimized for CPU cache locality. Our tests show that GPU cores on the MI300A prefer the brute force approach instead, significantly outperforming the CPU-based implementation. The significant benefit of Simultaneous Multithreading (SMT) was also a pleasant surprise.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Authors (1)