3DGauCIM: Accelerating Static/Dynamic 3D Gaussian Splatting via Digital CIM for High Frame Rate Real-Time Edge Rendering (2507.19133v1)
Abstract: Dynamic 3D Gaussian splatting (3DGS) extends static 3DGS to render dynamic scenes, enabling AR/VR applications with moving objects. However, implementing dynamic 3DGS on edge devices faces challenges: (1) Loading all Gaussian parameters from DRAM for frustum culling incurs high energy costs. (2) Increased parameters for dynamic scenes elevate sorting latency and energy consumption. (3) Limited on-chip buffer capacity with higher parameters reduces buffer reuse, causing frequent DRAM access. (4) Dynamic 3DGS operations are not readily compatible with digital compute-in-memory (DCIM). These challenges hinder real-time performance and power efficiency on edge devices, leading to reduced battery life or requiring bulky batteries. To tackle these challenges, we propose algorithm-hardware co-design techniques. At the algorithmic level, we introduce three optimizations: (1) DRAM-access reduction frustum culling to lower DRAM access overhead, (2) Adaptive tile grouping to enhance on-chip buffer reuse, and (3) Adaptive interval initialization Bucket-Bitonic sort to reduce sorting latency. At the hardware level, we present a DCIM-friendly computation flow that is evaluated using the measured data from a 16nm DCIM prototype chip. Our experimental results on Large-Scale Real-World Static/Dynamic Datasets demonstrate the ability to achieve high frame rate real-time rendering exceeding 200 frame per second (FPS) with minimal power consumption, merely 0.28 W for static Large-Scale Real-World scenes and 0.63 W for dynamic Large-Scale Real-World scenes. This work successfully addresses the significant challenges of implementing static/dynamic 3DGS technology on resource-constrained edge devices.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.