On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers (2407.10734v2)

Published 15 Jul 2024 in cs.LG and cs.AI

Abstract: On-device training of DNNs allows models to adapt and fine-tune to newly collected data or changing domains while deployed on microcontroller units (MCUs). However, DNN training is a resource-intensive task, making the implementation and execution of DNN training algorithms on MCUs challenging due to low processor speeds, constrained throughput, limited floating-point support, and memory constraints. In this work, we explore on-device training of DNNs for Cortex-M MCUs. We present a method that enables efficient training of DNNs completely in place on the MCU using fully quantized training (FQT) and dynamic partial gradient updates. We demonstrate the feasibility of our approach on multiple vision and time-series datasets and provide insights into the tradeoff between training accuracy, memory overhead, energy, and latency on real hardware.
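
The abstract names two techniques without showing code: fully quantized training (FQT) and dynamic partial gradient updates. The following is a minimal NumPy sketch of how those two ideas can fit together in a single dense-layer training step; it is not the authors' implementation. The function names, the `update_fraction` knob, and the row-norm selection heuristic are all illustrative assumptions.

```python
import numpy as np

def quantize(x, scale):
    """Symmetric per-tensor int8 quantization: scale, round, clamp."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

def fqt_dense_step(W_q, s_w, x_q, s_x, grad_out, lr, update_fraction=0.25):
    """One hypothetical FQT step for a dense layer with a partial update.

    W_q: (out, in) int8 weights, s_w: weight scale
    x_q: (in,) int8 input activations, s_x: activation scale
    grad_out: (out,) float gradient of the loss w.r.t. the layer output
    """
    # Forward pass in integer arithmetic, accumulating in int32
    # (mirroring how MCU kernels avoid floating-point hardware).
    acc = W_q.astype(np.int32) @ x_q.astype(np.int32)
    y = acc.astype(np.float32) * (s_w * s_x)  # dequantized layer output

    # Weight gradient computed from the quantized input activations.
    grad_W = np.outer(grad_out, dequantize(x_q, s_x))

    # "Dynamic partial gradient update": only the rows with the largest
    # gradient norms are updated this step. The fraction and the norm
    # criterion are stand-ins for whatever rule the paper actually uses.
    k = max(1, int(update_fraction * W_q.shape[0]))
    rows = np.argpartition(np.linalg.norm(grad_W, axis=1), -k)[-k:]

    W_f = dequantize(W_q, s_w)
    W_f[rows] -= lr * grad_W[rows]

    # Re-quantize the updated weights in place (per-tensor scale held fixed).
    return quantize(W_f, s_w), y

# Toy usage with random data:
rng = np.random.default_rng(0)
W_q = quantize(rng.normal(size=(16, 32)).astype(np.float32), 0.05)
x_q = quantize(rng.normal(size=32).astype(np.float32), 0.05)
W_q, y = fqt_dense_step(W_q, 0.05, x_q, 0.05,
                        grad_out=np.ones(16, np.float32), lr=1e-2)
```

Updating only a subset of weight rows per step, as sketched above, is one way such a scheme can trade a little training accuracy for lower memory traffic and latency, which is the trade-off space the abstract says the paper evaluates on real hardware.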

Authors (4)
  1. Mark Deutel (4 papers)
  2. Frank Hannig (19 papers)
  3. Christopher Mutschler (59 papers)
  4. Jürgen Teich (33 papers)
