Short reasons for long vectors in HPC CPUs: a study based on RISC-V (2309.06865v2)

Published 13 Sep 2023 in cs.DC and cs.AR

Abstract: For years, SIMD/vector units have enhanced the capabilities of modern CPUs in High-Performance Computing (HPC) and mobile technology. Typical commercially available SIMD units process up to 8 double-precision elements with one instruction. The optimal vector width and its impact on CPU throughput due to memory latency and bandwidth remain challenging research areas. This study examines the behavior of four computational kernels on a RISC-V core connected to a customizable vector unit, capable of operating on up to 256 double-precision elements per instruction. The four codes have been purposefully selected to represent non-dense workloads: SpMV, BFS, PageRank, FFT. The experimental setup allows us to measure their performance while varying the vector length, the memory latency, and bandwidth. Our results not only show that larger vector lengths allow for better tolerance of limitations in the memory subsystem but also offer hope to code developers beyond dense linear algebra.

Authors (6)
  1. Pablo Vizcaino (3 papers)
  2. Georgios Ieronymakis (2 papers)
  3. Nikolaos Dimou (3 papers)
  4. Vassilis Papaefstathiou (5 papers)
  5. Jesus Labarta (21 papers)
  6. Filippo Mantovani (9 papers)
Citations (4)
