
Efficient On-device Training via Gradient Filtering (2301.00330v2)

Published 1 Jan 2023 in cs.CV, cs.AI, and cs.LG

Abstract: Despite its importance for federated learning, continuous learning and many other applications, on-device training remains an open problem for EdgeAI. The problem stems from the large number of operations (e.g., floating point multiplications and additions) and memory consumption required during training by the back-propagation algorithm. Consequently, in this paper, we propose a new gradient filtering approach which enables on-device CNN model training. More precisely, our approach creates a special structure with fewer unique elements in the gradient map, thus significantly reducing the computational complexity and memory consumption of back propagation during training. Extensive experiments on image classification and semantic segmentation with multiple CNN models (e.g., MobileNet, DeepLabV3, UPerNet) and devices (e.g., Raspberry Pi and Jetson Nano) demonstrate the effectiveness and wide applicability of our approach. For example, compared to SOTA, we achieve up to 19$\times$ speedup and 77.1% memory savings on ImageNet classification with only 0.1% accuracy loss. Finally, our method is easy to implement and deploy; over 20$\times$ speedup and 90% energy savings have been observed compared to highly optimized baselines in MKLDNN and CUDNN on NVIDIA Jetson Nano. Consequently, our approach opens up a new direction of research with a huge potential for on-device training.

Citations (14)

Summary

  • The paper introduces gradient filtering to simplify gradient maps by averaging patches, significantly reducing computational complexity and memory usage.
  • Experimental results show up to 19× speedup and 77.1% memory savings on ImageNet with only a 0.1% accuracy loss.
  • The technique enables efficient on-device training across various neural networks without requiring changes to the model architecture.

Efficient On-device Training via Gradient Filtering

Introduction to Gradient Filtering

On-device training faces significant challenges due to the computational and memory complexity of the back-propagation algorithm, especially when considering the resource constraints of edge devices. A new method, dubbed gradient filtering, presents a novel way to address these concerns by creating a special structure within the gradient map, thus reducing both the computational complexity and memory consumption associated with back propagation during training. This approach trades off gradient precision against computational complexity, offering a potentially transformative solution for efficient on-device model training.

Gradient Filtering Methodology

The crux of the gradient filtering approach lies in reducing the computational footprint of the backward pass through convolution layers, a notorious bottleneck in on-device training due to its extensive computational and memory demands. The method achieves this by spatially segmenting the gradient map into patches, within which elements are averaged to a single unique value. Consequently, this not only yields a significant reduction in the number of unique elements but also imparts a structured simplicity to the gradient map, facilitating cheaper operations and lower memory requirements. This approximation introduces a controllable error, whose impact on model accuracy has been rigorously analyzed and shown to be negligible in most cases.
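To make the patch-averaging idea concrete, below is a minimal sketch assuming PyTorch; the function name `filter_gradient` and the `patch_size` parameter are illustrative conveniences, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def filter_gradient(grad_output: torch.Tensor, patch_size: int = 2) -> torch.Tensor:
    """Replace each (patch_size x patch_size) spatial patch of a gradient map
    with its mean, so every patch holds a single unique value."""
    _, _, h, w = grad_output.shape
    # Compute per-patch means via non-overlapping average pooling ...
    pooled = F.avg_pool2d(grad_output, kernel_size=patch_size,
                          stride=patch_size, ceil_mode=True)
    # ... then broadcast each mean back over its patch with nearest-neighbor upsampling.
    return F.interpolate(pooled, size=(h, w), mode="nearest")

# Toy usage: a gradient w.r.t. a conv layer's output, shape (N, C, H, W).
g = torch.randn(1, 8, 16, 16)
g_filtered = filter_gradient(g, patch_size=2)  # 4x fewer unique spatial values per channel
```

This sketch only reproduces the approximation itself; the speedups and memory savings reported in the paper come from reformulating the backward-pass convolutions so that they operate on the few unique values per patch rather than on the full-resolution gradient map.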

Experimental Results and Analysis

Through extensive experimentation across multiple CNN models and computer vision tasks, gradient filtering demonstrated promising outcomes:

  • Achieved up to 19× speedup and 77.1% memory savings on ImageNet classification with only a 0.1% accuracy loss.
  • Offered over 20× speedup and 90% energy savings compared to highly optimized MKLDNN and CUDNN baselines in real-life deployment scenarios on devices such as the NVIDIA Jetson Nano.
  • Demonstrated wide applicability and effectiveness across various neural networks and tasks without the need to adjust the architecture or freeze layers.

Implications and Future Directions

The gradient filtering approach marks a significant step forward in on-device training, particularly for EdgeAI. By addressing the computational and memory complexities inherent in back-propagation, this method opens up new avenues for research and practical applications of AI directly on edge devices. Its ease of implementation and deployment, coupled with substantial performance gains, presents a compelling case for its adoption in real-world scenarios. Moving forward, it's plausible to envision not only a broader application of this technique across different domains but also further optimizations that could minimize accuracy trade-offs or even enhance model performance.

Contributions and Acknowledgments

This paper stands out by proposing a method that effectively balances the trade-off between computational complexity and gradient precision, enabling more efficient training on resource-constrained devices. The authors' contribution extends beyond a novel algorithm; it includes a rigorous error analysis and a thorough empirical evaluation across several benchmarks, demonstrating the practicality and effectiveness of their approach. This work was supported in part by a US National Science Foundation grant, underscoring the importance of funding cutting-edge research in machine learning and AI.

In conclusion, gradient filtering emerges as a powerful technique for enhancing the feasibility of on-device training. Its simplicity, combined with the significant reductions in computational and memory requirements, makes it an attractive proposition for advancing EdgeAI. As the demand for efficient, on-device AI solutions grows, techniques like gradient filtering will undoubtedly play a pivotal role in making these technologies accessible and practical for a wide range of applications.
