Keyword Spotting System and Evaluation of Pruning and Quantization Methods on Low-power Edge Microcontrollers (2208.02765v1)

Published 4 Aug 2022 in cs.SD, cs.AI, cs.AR, and eess.AS

Abstract: Keyword spotting (KWS) is beneficial for voice-based user interaction with low-power devices at the edge. Edge devices are usually always-on, so edge computing brings bandwidth savings and privacy protection. These devices, for example Cortex-M based microcontrollers, typically have limited memory, computational performance, power, and cost budgets. The challenge is to meet the high computation and low-latency requirements of deep learning on these devices. This paper first presents our small-footprint KWS system running on an STM32F7 microcontroller with a Cortex-M7 core at 216 MHz and 512 KB of static RAM. Our selected convolutional neural network (CNN) architecture uses a reduced number of operations for KWS to meet the constraints of edge devices. Our baseline system generates a classification result every 37 ms, including the real-time audio feature extraction. The paper further evaluates the actual performance of different pruning and quantization methods on the microcontroller, including different granularities of sparsity, skipping zero weights, weight-prioritized loop order, and SIMD instructions. The results show that, on microcontrollers, accelerating unstructured pruned models poses considerable challenges, and that structured pruning is more hardware-friendly than unstructured pruning. The results also verify the performance improvement from quantization and SIMD instructions.
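
To make the contrast in the abstract concrete, the following is a minimal Python sketch of zero-weight skipping under unstructured and structured sparsity, plus symmetric int8 quantization. It is not the paper's implementation: the function names, the per-tensor quantization scheme, and the row-wise pruning granularity are all assumptions chosen for illustration, and the multiply-accumulate (MAC) counter merely stands in for the cycle counts one would measure on a real microcontroller.

```python
# Illustrative sketch only; names and schemes are assumptions, not the paper's code.

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to int8 using one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def matvec_skip_zero_weights(rows, x):
    """Unstructured sparsity: test every weight and skip the zeros.

    MACs are saved, but each weight costs a branch.
    """
    out, macs = [], 0
    for row in rows:
        acc = 0
        for w, xi in zip(row, x):
            if w != 0:
                acc += w * xi
                macs += 1
        out.append(acc)
    return out, macs


def matvec_skip_pruned_rows(rows, kept_rows, x):
    """Structured sparsity: whole rows are pruned, so only kept rows are visited.

    The inner loop stays dense and branch-free.
    """
    out, macs = [0] * len(rows), 0
    for r in kept_rows:
        acc = 0
        for w, xi in zip(rows[r], x):
            acc += w * xi
            macs += 1
        out[r] = acc
    return out, macs


if __name__ == "__main__":
    weights = [[1, 0, 2], [0, 0, 0], [0, 3, 0]]
    x = [1, 2, 3]
    print(matvec_skip_zero_weights(weights, x))
    print(matvec_skip_pruned_rows(weights, [0, 2], x))
    print(quantize_int8([1.0, -1.0, 0.0]))
```

In this sketch the unstructured variant performs fewer MACs but pays a data-dependent branch per weight, while the structured variant keeps a dense inner loop. A dense loop is the form that can plausibly be mapped onto SIMD instructions, which is one way to read the abstract's observation that structured pruning suits microcontrollers better.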
