A Dynamic Allocation Scheme for Adaptive Shared-Memory Mapping on Kilo-core RV Clusters for Attention-Based Model Deployment (2508.01180v1)

Published 2 Aug 2025 in cs.AR

Abstract: Attention-based models demand flexible hardware to manage diverse kernels with varying arithmetic intensities and memory access patterns. Large clusters with shared L1 memory, a common architectural pattern, struggle to fully utilize their processing elements (PEs) when scaled up due to reduced throughput in the hierarchical PE-to-L1 intra-cluster interconnect. This paper presents the Dynamic Allocation Scheme (DAS), a runtime-programmable address remapping hardware unit coupled with a unified memory allocator, designed to minimize data access contention of PEs onto the multi-banked L1. We evaluated DAS on an aggressively scaled-up 1024-PE RISC-V cluster with a Non-Uniform Memory Access (NUMA) PE-to-L1 interconnect to demonstrate its potential for improving data locality in large parallel machine learning workloads. For a Vision Transformer (ViT)-L/16 model, each encoder layer executes in 5.67 ms, achieving a 1.94x speedup over the fixed word-level interleaved baseline with 0.81 PE utilization. Implemented in 12nm FinFET technology, DAS incurs <0.1% area overhead.
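To make the contrast in the abstract concrete, the following is a minimal sketch of how a fixed word-level interleaved mapping spreads a contiguous buffer across every bank group, while a remapped layout can keep it inside one group. It is not the paper's DAS logic: the function names, the 4-byte word size, the notion of a "tile" as a group of banks, and all sizes are assumptions chosen only for illustration.

```python
WORD_BYTES = 4  # assumed 32-bit words (not stated in the abstract)


def interleaved_bank(addr: int, num_banks: int) -> int:
    """Word-level interleaving: consecutive words land in consecutive banks."""
    return (addr // WORD_BYTES) % num_banks


def remapped_bank(addr: int, num_banks: int, banks_per_tile: int,
                  words_per_bank: int) -> int:
    """Hypothetical remap: a contiguous chunk of
    banks_per_tile * words_per_bank words stays within one tile's banks,
    and is interleaved only across the banks of that tile."""
    word = addr // WORD_BYTES
    chunk_words = banks_per_tile * words_per_bank
    num_tiles = num_banks // banks_per_tile
    tile = (word // chunk_words) % num_tiles
    bank_in_tile = word % banks_per_tile
    return tile * banks_per_tile + bank_in_tile


def tiles_touched(bank_fn, base: int, n_words: int, banks_per_tile: int) -> set:
    """Set of tiles that a contiguous buffer of n_words words maps onto."""
    return {
        bank_fn(base + i * WORD_BYTES) // banks_per_tile
        for i in range(n_words)
    }


# Illustrative sizes only: 16 banks grouped into 4 tiles of 4 banks each.
NUM_BANKS, BANKS_PER_TILE, WORDS_PER_BANK = 16, 4, 8

baseline = tiles_touched(
    lambda a: interleaved_bank(a, NUM_BANKS), 0, 32, BANKS_PER_TILE)
remapped = tiles_touched(
    lambda a: remapped_bank(a, NUM_BANKS, BANKS_PER_TILE, WORDS_PER_BANK),
    0, 32, BANKS_PER_TILE)

print("tiles touched, interleaved:", sorted(baseline))
print("tiles touched, remapped:   ", sorted(remapped))
```

With these toy sizes, a 32-word buffer touches all four tiles under word-level interleaving but only one tile under the remapped layout. That is the kind of locality gain a PE placed near that tile would see on a NUMA interconnect, under the assumptions above.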

Authors (5)