Dissecting RISC-V Performance: Practical PMU Profiling and Hardware-Agnostic Roofline Analysis on Emerging Platforms (2507.22451v1)
Abstract: As RISC-V architectures proliferate across embedded and high-performance domains, developers face persistent challenges in performance optimization due to fragmented tooling, immature hardware features, and platform-specific defects. This paper delivers a pragmatic methodology for extracting actionable performance insights on RISC-V systems, even under constrained or unreliable hardware conditions. We present a workaround that circumvents hardware bugs in one of the popular RISC-V implementations, enabling robust event sampling. For memory-compute bottleneck analysis, we introduce compiler-driven Roofline tooling that operates without hardware PMU dependencies, leveraging LLVM-based instrumentation to derive operational intensity and throughput metrics directly from application IR. Our open-source toolchain automates these workarounds, unifying PMU data correction and compiler-guided Roofline construction into a single workflow.
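To make the Roofline terms in the abstract concrete, the following is a minimal sketch of the standard Roofline arithmetic: operational intensity is FLOPs divided by bytes moved, and attainable throughput is capped by the lower of the compute peak and the bandwidth-scaled intensity. It is not the paper's toolchain. The function names and all machine and kernel numbers are hypothetical, and the FLOP and byte counts are assumed to come from some external source such as compiler instrumentation.

```python
def operational_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(oi: float, peak_gflops: float, peak_bw_gbs: float) -> float:
    """Roofline bound: min(compute peak, memory bandwidth * intensity)."""
    return min(peak_gflops, peak_bw_gbs * oi)


def classify(oi: float, peak_gflops: float, peak_bw_gbs: float) -> str:
    """Compare intensity with the ridge point where the two roofs meet."""
    ridge = peak_gflops / peak_bw_gbs
    return "memory-bound" if oi < ridge else "compute-bound"


# Hypothetical machine: 16 GFLOP/s compute peak, 8 GB/s memory bandwidth.
PEAK_GFLOPS = 16.0
PEAK_BW_GBS = 8.0

# Hypothetical kernel counts (assumed to be supplied by instrumentation).
oi = operational_intensity(flops=2e9, bytes_moved=8e9)
bound = attainable_gflops(oi, PEAK_GFLOPS, PEAK_BW_GBS)
regime = classify(oi, PEAK_GFLOPS, PEAK_BW_GBS)
print(f"OI = {oi} FLOP/byte, bound = {bound} GFLOP/s, {regime}")
```

With these assumed numbers the kernel sits left of the ridge point (2.0 FLOP/byte), so the bandwidth roof rather than the compute roof limits it.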