Papers
Topics
Authors
Recent
Search
2000 character limit reached

HERMES: High-Performance RISC-V Memory Hierarchy for ML Workloads

Published 17 Mar 2025 in cs.AR and cs.PF | (2503.13064v2)

Abstract: The growth of ML workloads has underscored the importance of efficient memory hierarchies to address bandwidth, latency, and scalability challenges. HERMES focuses on optimizing memory subsystems for RISC-V architectures to meet the computational needs of ML models such as CNNs, RNNs, and Transformers. This project explores state-of-the-art techniques such as advanced prefetching, tensor-aware caching, and hybrid memory models. The cornerstone of HERMES is the integration of shared L3 caches with fine-grained coherence protocols equipped with specialized pathways to deep-learning accelerators such as Gemmini. Simulation tools like Gem5 and DRAMSim2 were used to evaluate baseline performance and scalability under representative ML workloads. The findings of this study highlight the design choices, and the anticipated challenges, paving the way for low-latency scalable memory operations for ML applications.

Authors (1)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.