CADC: Crossbar-Aware Dendritic Convolution for Efficient In-memory Computing
Abstract: Convolutional neural networks (CNNs) are computationally intensive and often accelerated on crossbar-based in-memory computing (IMC) architectures. However, large convolutional layers must be partitioned across multiple crossbars, generating numerous partial sums (psums) that require additional buffering, transfer, and accumulation, introducing significant system-level overhead. Inspired by dendritic computing principles from neuroscience, we propose crossbar-aware dendritic convolution (CADC), a novel approach that dramatically increases psum sparsity by embedding a nonlinear dendritic function (zeroing negative values) directly within crossbar computations. Experimental results demonstrate that CADC significantly reduces psums, eliminating 80% in LeNet-5 on MNIST, 54% in ResNet-18 on CIFAR-10, 66% in VGG-16 on CIFAR-100, and up to 88% in a spiking neural network (SNN) on the DVS Gesture dataset. The induced sparsity provides two key benefits: (1) it enables zero-compression and zero-skipping, reducing buffer and transfer overhead by 29.3% and accumulation overhead by 47.9%; (2) it minimizes the accumulation of ADC quantization noise, resulting in small accuracy degradation: only 0.01% for LeNet-5, 0.1% for ResNet-18, 0.5% for VGG-16, and 0.9% for the SNN. Compared to vanilla convolution (vConv), CADC exhibits accuracy changes ranging from +0.11% to +0.19% for LeNet-5, -0.04% to -0.27% for ResNet-18, +0.99% to +1.60% for VGG-16, and -0.57% to +1.32% for the SNN, across crossbar sizes from 64x64 to 256x256. Ultimately, an SRAM-based IMC implementation of CADC achieves 2.15 TOPS and 40.8 TOPS/W for ResNet-18 (4/2/4b), realizing an 11x-18x speedup and a 1.9x-22.9x improvement in energy efficiency over existing IMC accelerators.
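To make the mechanism concrete, below is a minimal NumPy sketch of psum generation under an im2col-style crossbar mapping: vanilla convolution lets each crossbar emit signed psums, while CADC applies the dendritic nonlinearity (zeroing negatives) before psums leave the array. This is an illustration under assumed details, not the paper's implementation; the names XBAR_ROWS, vconv_psums, and cadc_psums and the random toy layer are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy im2col view of one conv layer: the unrolled kernel has 256 input
# rows, mapped onto 64-row crossbars -> 4 crossbars per output column.
XBAR_ROWS = 64                      # crossbar height (illustrative)
in_feats, out_feats, n_pix = 256, 32, 100

x = rng.standard_normal((n_pix, in_feats))            # unrolled activations
w = rng.standard_normal((in_feats, out_feats)) * 0.1  # unrolled conv weights

def vconv_psums(x, w):
    """Vanilla convolution (vConv): each crossbar emits a signed psum."""
    return [x[:, s:s + XBAR_ROWS] @ w[s:s + XBAR_ROWS]
            for s in range(0, in_feats, XBAR_ROWS)]

def cadc_psums(x, w):
    """CADC sketch: a dendritic nonlinearity zeroes negative psums
    inside each crossbar, before buffering/transfer/accumulation."""
    return [np.maximum(p, 0.0) for p in vconv_psums(x, w)]

for name, psums in (("vConv", vconv_psums(x, w)), ("CADC", cadc_psums(x, w))):
    flat = np.concatenate([p.ravel() for p in psums])
    acc = sum(psums)                # cross-crossbar accumulation
    print(f"{name}: psum sparsity {np.mean(flat == 0):5.1%}, "
          f"output mean {acc.mean():+.3f}")
```

The zeroed psums are what zero-compression and zero-skipping exploit downstream, which is the source of the buffer, transfer, and accumulation savings cited above; presumably the networks are trained with the dendritic function in the loop, which is why CADC accuracies differ slightly from vConv.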