- The paper introduces a sub-million parameter YOLO architecture optimized for real-time object detection on smart glasses.
- It utilizes dual microcontrollers to achieve up to 18 FPS and reduce inference latency to 16.9 ms while conserving energy.
- The research demonstrates significant power savings and presents open-source designs to advance edge AI in wearable devices.
Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with Tinyissimo YOLO
The paper presents the development and deployment of ultra-efficient on-device object detection tailored to AI-integrated smart glasses, built on the Tinyissimo YOLO framework. The emphasis is on delivering useful detection performance within the small form factor and limited battery capacity of smart glasses, in line with the broader push toward edge intelligence.
System Design and Architecture
A novel smart glasses prototype was developed around two microcontrollers: the RISC-V-based GAP9 System-on-Chip (SoC) from GreenWaves Technologies, which integrates a machine learning accelerator, and the ISM4520 SoC with an nRF52 microcontroller handling communication tasks. This split enables real-time on-device inference while keeping power consumption low, since the compute-heavy vision workload runs on the energy-efficient GAP9.
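As a rough illustration of this division of labour, the sketch below mocks the two roles in plain Python: the GAP9 side captures a frame and runs the detector, while the communication MCU only forwards compact detection results over BLE. All function and structure names here are hypothetical stand-ins; the actual firmware is written in C against the vendors' SDKs.

```python
from dataclasses import dataclass
import random

@dataclass
class Detection:
    label: int
    score: float
    box: tuple  # (x, y, w, h), normalized image coordinates

def gap9_capture_and_infer() -> list[Detection]:
    """Stand-in for the GAP9 side: capture a frame and run the quantized
    detector on the on-chip ML accelerator, keeping raw pixels on-device."""
    # Dummy result; on hardware this would be the Tinyissimo YOLO forward pass.
    return [Detection(label=random.randint(0, 2), score=0.9, box=(0.4, 0.4, 0.2, 0.2))]

def nrf52_transmit(detections: list[Detection]) -> None:
    """Stand-in for the communication MCU: forward a few bytes of detection
    metadata over BLE instead of streaming image frames."""
    print(f"BLE notify: {len(detections)} detection(s)")

# Main loop: heavy computation stays on the GAP9; only compact results leave the device.
for _ in range(3):
    nrf52_transmit(gap9_capture_and_infer())
```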
Tinyissimo YOLO Architectures
The primary contribution is a family of sub-million-parameter YOLO architectures tailored to the tight memory and compute budgets of microcontrollers. The paper introduces several variants, dubbed Tinyissimo YOLO v1.3, v5, and v8, each adapting an existing YOLO architecture to low memory and computation requirements while maintaining satisfactory detection accuracy. The smallest variants reduce the parameter count by up to 50x compared to traditional YOLOv1 models.
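To make the parameter scale concrete, the following is a minimal sketch of a sub-million-parameter, YOLO-style network in PyTorch. The layer widths, depth, input resolution, and single-grid detection head are illustrative assumptions, not the published Tinyissimo YOLO topology; the point is simply that a narrow Conv-BN-ReLU backbone with a 1x1 prediction head stays far below one million parameters.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    # Conv-BN-ReLU: the standard building block of compact YOLO backbones.
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class TinyYoloLike(nn.Module):
    """Narrow convolutional backbone plus a 1x1 head that predicts
    (x, y, w, h, objectness) and class scores for every grid cell."""
    def __init__(self, num_classes=3, widths=(16, 32, 64, 128)):
        super().__init__()
        layers, cin = [], 3
        for cout in widths:
            layers += [conv_block(cin, cout), nn.MaxPool2d(2)]
            cin = cout
        self.backbone = nn.Sequential(*layers)
        self.head = nn.Conv2d(cin, 5 + num_classes, kernel_size=1)

    def forward(self, x):
        return self.head(self.backbone(x))

model = TinyYoloLike()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")                # roughly 0.1M, far below 1M
print(model(torch.randn(1, 3, 112, 112)).shape)  # -> (1, 8, 7, 7) prediction grid
```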
Empirical Evaluation
Assessment of the prototype platform showed that the system achieves up to 18 frames per second (FPS) with an end-to-end latency of 56 milliseconds and an overall power consumption of approximately 62.9 mW, corresponding to an estimated continuous operating time of 9.3 hours on a 154 mAh battery. Notably, with Tinyissimo YOLO v1.3 the inference time drops to 16.9 milliseconds at an energy cost of 1.59 mJ per inference. This compares favourably with conventional edge systems such as MCUNet, which handles only image classification and runs at a much lower 7.3 FPS. A comparative evaluation against platforms such as the Sony IMX500 and Coral Micro further underscores the power-efficiency advantage of the presented system.
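The headline numbers can be sanity-checked with simple arithmetic. The short Python snippet below reproduces the battery-life estimate and the average power drawn during one inference; the nominal 3.7 V cell voltage is an assumption on our part, which likely accounts for the small gap to the reported 9.3 hours.

```python
# Back-of-the-envelope checks of the reported figures.
battery_capacity_mAh = 154
cell_voltage_V = 3.7                      # assumed nominal Li-Po voltage (not stated above)
system_power_mW = 62.9

battery_energy_mWh = battery_capacity_mAh * cell_voltage_V    # ~570 mWh
runtime_h = battery_energy_mWh / system_power_mW              # ~9.1 h
print(f"Estimated continuous runtime: {runtime_h:.1f} h")

# Average power drawn during one Tinyissimo YOLO v1.3 inference:
inference_energy_mJ = 1.59
inference_time_ms = 16.9
inference_power_mW = inference_energy_mJ / inference_time_ms * 1000   # ~94 mW
print(f"Average power during inference: {inference_power_mW:.0f} mW")
```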
Theoretical and Practical Implications
This research advances edge processing for wearable devices, particularly object detection in AI-integrated wearables. By deploying computationally demanding tasks on low-energy processors with near state-of-the-art accuracy, the work supports enhanced user privacy, reduced latency, and extended device autonomy. Furthermore, the open-source release of the Tinyissimo YOLO implementations facilitates reproducibility and future research, supporting the broader adoption of AI in resource-limited environments.
Future Prospects
The exploration of neural architecture search and quantization-aware training presents opportunities for further optimization and adaptation of the Tinyissimo YOLO variants. Moreover, tighter sensor integration, moving toward AI-in-sensor approaches, may further refine and expand the practical applications of such smart glasses platforms. These enhancements would contribute to the evolving AIoT landscape, bridging the gap between novel processing units and sustainable wearable technology.
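For quantization-aware training specifically, the following is a minimal PyTorch sketch of the workflow: fake-quantization observers are inserted into a small Conv-BN-ReLU model, a few training steps let them calibrate value ranges, and the model is then converted to int8 modules. The toy model, loss, and data are placeholders rather than the Tinyissimo YOLO training pipeline; deployment on the GAP9 would additionally go through the vendor's toolchain.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (DeQuantStub, QuantStub, convert,
                                   get_default_qat_qconfig, prepare_qat)

class TinyQATModel(nn.Module):
    """Small Conv-BN-ReLU stack wrapped in Quant/DeQuant stubs so the whole
    forward pass is simulated in int8 during training."""
    def __init__(self, num_outputs=8):
        super().__init__()
        self.quant = QuantStub()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1, bias=False), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1, bias=False), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Conv2d(32, num_outputs, 1)    # toy detection-style head
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.head(self.features(self.quant(x))))

model = TinyQATModel().train()
model.qconfig = get_default_qat_qconfig("fbgemm")    # default fake-quant config
qat_model = prepare_qat(model)                       # insert fake-quant observers

# A few dummy steps: weights keep training while observers learn value ranges.
opt = torch.optim.SGD(qat_model.parameters(), lr=1e-3)
for _ in range(3):
    out = qat_model(torch.randn(2, 3, 96, 96))
    loss = out.abs().mean()                          # placeholder loss
    loss.backward()
    opt.step()
    opt.zero_grad()

int8_model = convert(qat_model.eval())               # swap in int8 modules
print(int8_model.head)                               # prints a QuantizedConv2d
```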
In summary, this work provides a comprehensive approach to energy-efficient smart glasses through combined hardware and algorithmic innovation, contributing to the evolution of ultra-efficient edge AI systems.