Low-Energy On-Device Personalization for MCUs (2403.08040v4)
Abstract: Microcontroller Units (MCUs) are ideal platforms for edge applications thanks to their low cost and energy consumption, and they are widely used in applications such as personalized machine learning, where customized models can improve task adaptation. However, existing approaches to local on-device personalization mostly support only simple ML architectures or require complex local pre-training/training, leading to high energy consumption that negates the low-energy advantage of MCUs. In this paper, we introduce MicroT, an efficient, low-energy MCU personalization approach. MicroT includes a robust, general, yet tiny feature extractor, developed through self-supervised knowledge distillation; on top of this extractor, a task-specific head is trained to enable independent on-device personalization with minimal energy and computational requirements. MicroT also implements an MCU-optimized early-exit inference mechanism, called stage-decision, to further reduce energy costs. This mechanism lets users configure the exit criterion (the stage-decision ratio) to adaptively balance energy cost against model performance. We evaluated MicroT using two models, three datasets, and two MCU boards. MicroT outperforms traditional transfer learning (TTL) and two state-of-the-art (SOTA) approaches by 2.12-11.60% across two models and three datasets. Targeting widely used energy-aware edge devices, MicroT performs on-device training without additional complex operations, reducing energy cost by up to 2.28x compared to SOTA approaches while keeping SRAM usage below 1 MB. During local inference, MicroT reduces energy cost by 14.17% compared to TTL across two boards and two datasets, highlighting its suitability for long-term deployment on energy-aware, resource-constrained MCUs.
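To make the stage-decision idea concrete, below is a minimal sketch of an early-exit inference loop in PyTorch. This is not the paper's implementation: the split point, the `TwoStageModel` and `stage_decision_infer` names, the head structure, and the mapping from the stage-decision ratio to a fixed confidence threshold are all assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code) of a stage-decision style
# early-exit inference loop. Model structure and threshold are
# hypothetical; in MicroT the exit criterion is derived from the
# user-configured stage-decision ratio rather than a fixed value.
import torch
import torch.nn as nn


class TwoStageModel(nn.Module):
    """Backbone split into an early and a late stage, each with its
    own lightweight classifier head (hypothetical structure)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head1 = nn.Linear(16, num_classes)   # early-exit head
        self.stage2 = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
        self.head2 = nn.Linear(32, num_classes)   # full-model head


def stage_decision_infer(model: TwoStageModel, x: torch.Tensor,
                         threshold: float = 0.8):
    """Exit at stage 1 when the early head is confident enough;
    otherwise run the remaining stage. `threshold` stands in for the
    exit criterion that MicroT derives from the stage-decision ratio."""
    feats = model.stage1(x)
    probs1 = torch.softmax(model.head1(feats), dim=-1)
    conf, pred = probs1.max(dim=-1)
    if conf.item() >= threshold:                 # confident: stop early, save energy
        return pred.item(), "early-exit"
    logits2 = model.head2(model.stage2(feats))   # otherwise finish the network
    return logits2.argmax(dim=-1).item(), "full-model"


model = TwoStageModel().eval()
with torch.no_grad():
    label, path = stage_decision_infer(model, torch.randn(1, 3, 32, 32))
print(label, path)
```

Raising the threshold (i.e., a stricter exit criterion) routes more samples through the full model, trading energy for accuracy; lowering it does the opposite, which is the adaptive balance the stage-decision ratio exposes to the user.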