Co-design of tile-based accelerator templates and dataflow for efficient LLM mapping
Establish a co-design methodology that couples tile-based accelerator template design with dataflow selection, so that large language model (LLM) workloads map efficiently onto tile-based architectures with many processing elements (PEs).
References
Furthermore, co-designing a tile-based accelerator template that can efficiently map LLM workloads remains an open architectural problem which is tightly coupled with dataflow selection.
— FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Large Attention-Based Model Inference on Tile-Based Accelerators
(2604.02110 - Zhang et al., 2 Apr 2026) in Section 1, Introduction