ONNXim: A Fast, Cycle-level Multi-core NPU Simulator (2406.08051v1)

Published 12 Jun 2024 in cs.AR and cs.PF

Abstract: As DNNs are widely adopted across application domains with ever-growing compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is becoming increasingly important. However, existing architectural NPU simulators lack support for high-speed simulation, multi-core modeling, multi-tenant scenarios, detailed DRAM/NoC modeling, and/or different deep learning frameworks. To address these limitations, this work proposes ONNXim, a fast cycle-level simulator for multi-core NPUs in DNN serving systems. For ease of simulation, it takes DNN models represented in the ONNX graph format, which can be generated from various deep learning frameworks. In addition, based on the observation that typical NPU cores process tensor tiles from on-chip scratchpad memory with deterministic compute latency, we forgo detailed modeling of the computation while preserving simulation accuracy. ONNXim also preserves dependencies between compute and tile DMAs. Meanwhile, the DRAM and NoC are modeled at cycle level to properly capture contention among multiple cores that may execute different DNN models for multi-tenancy. Consequently, ONNXim is significantly faster than existing simulators (e.g., by up to 384x over Accel-sim) and enables case studies, such as multi-tenant NPUs, that were previously impractical due to slow simulation speed and/or lack of functionality. ONNXim is publicly available at https://github.com/PSAL-POSTECH/ONNXim.
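Since ONNXim consumes DNN models in the ONNX graph format, a model from any major framework can be exported before simulation. Below is a minimal sketch of such an export using PyTorch's standard torch.onnx.export API; the simulator invocation in the trailing comment is an assumption for illustration only, so consult the ONNXim repository for the actual command-line interface.

```python
# Minimal sketch: export a PyTorch model to the ONNX graph format that
# ONNXim takes as input. torch.onnx.export is the standard PyTorch API;
# the ONNXim invocation at the bottom is hypothetical.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # batch of one 224x224 RGB image

torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
)

# The resulting resnet18.onnx graph can then be fed to the simulator; see
# https://github.com/PSAL-POSTECH/ONNXim for the real usage, e.g.:
#   ./onnxim --config <npu_config> --model resnet18.onnx   (hypothetical)
```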

Authors (8)
  1. Hyungkyu Ham (3 papers)
  2. Wonhyuk Yang (3 papers)
  3. Yunseon Shin (2 papers)
  4. Okkyun Woo (2 papers)
  5. Guseul Heo (6 papers)
  6. Sangyeop Lee (13 papers)
  7. Jongse Park (14 papers)
  8. Gwangsun Kim (4 papers)
Citations (2)
