- The paper introduces a sub-million parameter YOLO architecture optimized for real-time object detection on smart glasses.
- It utilizes dual microcontrollers to achieve up to 18 FPS and reduce inference latency to 16.9 ms while conserving energy.
- The research demonstrates significant power savings and presents open-source designs to advance edge AI in wearable devices.
Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with Tinyissimo YOLO
The paper presents the development and deployment of ultra-efficient on-device object detection tailored to AI-integrated smart glasses, built on the Tinyissimo YOLO framework. The emphasis is on delivering useful detection performance within the small form factor and limited battery capacity of smart glasses, in line with the broader push toward edge intelligence.
System Design and Architecture
A novel smart glasses prototype was developed around two microcontrollers: the RISC-V-based GAP9 System-on-Chip (SoC) from GreenWaves Technologies, which integrates a machine learning accelerator, and the ISM4520 SoC with an nRF52 microcontroller handling communication tasks. This split enables real-time on-device inference while keeping power consumption low, since the compute-heavy vision workload runs on the energy-efficient GAP9.
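As a rough illustration of this division of labour, the sketch below mocks the two roles in plain Python: the GAP9 side captures a frame and runs the detector, while the communication MCU only forwards compact detection results over BLE. All function and structure names here are hypothetical stand-ins; the actual firmware is written in C against the vendors' SDKs.

```python
from dataclasses import dataclass
import random

@dataclass
class Detection:
    label: int
    score: float
    box: tuple  # (x, y, w, h), normalized image coordinates

def gap9_capture_and_infer() -> list[Detection]:
    """Stand-in for the GAP9 side: capture a frame and run the quantized
    detector on the on-chip ML accelerator, keeping raw pixels on-device."""
    # Dummy result; on hardware this would be the Tinyissimo YOLO forward pass.
    return [Detection(label=random.randint(0, 2), score=0.9, box=(0.4, 0.4, 0.2, 0.2))]

def nrf52_transmit(detections: list[Detection]) -> None:
    """Stand-in for the communication MCU: forward a few bytes of detection
    metadata over BLE instead of streaming image frames."""
    print(f"BLE notify: {len(detections)} detection(s)")

# Main loop: heavy computation stays on the GAP9; only compact results leave the device.
for _ in range(3):
    nrf52_transmit(gap9_capture_and_infer())
```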
Tinyissimo YOLO Architectures
The primary contribution is a family of sub-million-parameter YOLO architectures tailored to the tight memory and compute budgets of microcontrollers. The paper introduces several variants, dubbed Tinyissimo YOLO v1.3, v5, and v8, each adapting an existing YOLO architecture to low memory and computation requirements while maintaining satisfactory detection accuracy. The smallest variants reduce the parameter count by up to 50x compared to traditional YOLOv1 models.
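To make the parameter scale concrete, the following is a minimal sketch of a sub-million-parameter, YOLO-style network in PyTorch. The layer widths, depth, input resolution, and single-grid detection head are illustrative assumptions, not the published Tinyissimo YOLO topology; the point is simply that a narrow Conv-BN-ReLU backbone with a 1x1 prediction head stays far below one million parameters.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    # Conv-BN-ReLU: the standard building block of compact YOLO backbones.
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class TinyYoloLike(nn.Module):
    """Narrow convolutional backbone plus a 1x1 head that predicts
    (x, y, w, h, objectness) and class scores for every grid cell."""
    def __init__(self, num_classes=3, widths=(16, 32, 64, 128)):
        super().__init__()
        layers, cin = [], 3
        for cout in widths:
            layers += [conv_block(cin, cout), nn.MaxPool2d(2)]
            cin = cout
        self.backbone = nn.Sequential(*layers)
        self.head = nn.Conv2d(cin, 5 + num_classes, kernel_size=1)

    def forward(self, x):
        return self.head(self.backbone(x))

model = TinyYoloLike()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")                # roughly 0.1M, far below 1M
print(model(torch.randn(1, 3, 112, 112)).shape)  # -> (1, 8, 7, 7) prediction grid
```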
Empirical Evaluation
Assessment of the prototype platform showed that the system achieves up to 18 frames per second (FPS) with an end-to-end latency of 56 milliseconds and an overall power consumption of approximately 62.9 mW, corresponding to an estimated continuous operating time of 9.3 hours on a 154 mAh battery. Notably, with Tinyissimo YOLO v1.3 the inference time drops to 16.9 milliseconds at an energy cost of 1.59 mJ per inference. This compares favourably with conventional edge systems such as MCUNet, which handles only image classification and runs at a much lower 7.3 FPS. A comparative evaluation against platforms such as the Sony IMX500 and Coral Micro further underscores the power-efficiency advantage of the presented system.
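The headline numbers can be sanity-checked with simple arithmetic. The short Python snippet below reproduces the battery-life estimate and the average power drawn during one inference; the nominal 3.7 V cell voltage is an assumption on our part, which likely accounts for the small gap to the reported 9.3 hours.

```python
# Back-of-the-envelope checks of the reported figures.
battery_capacity_mAh = 154
cell_voltage_V = 3.7                      # assumed nominal Li-Po voltage (not stated above)
system_power_mW = 62.9

battery_energy_mWh = battery_capacity_mAh * cell_voltage_V    # ~570 mWh
runtime_h = battery_energy_mWh / system_power_mW              # ~9.1 h
print(f"Estimated continuous runtime: {runtime_h:.1f} h")

# Average power drawn during one Tinyissimo YOLO v1.3 inference:
inference_energy_mJ = 1.59
inference_time_ms = 16.9
inference_power_mW = inference_energy_mJ / inference_time_ms * 1000   # ~94 mW
print(f"Average power during inference: {inference_power_mW:.0f} mW")
```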
Theoretical and Practical Implications
This research advances edge processing for wearable devices, particularly object detection in AI-integrated wearables. By deploying computationally demanding tasks on low-energy processors with near state-of-the-art accuracy, the work supports enhanced user privacy, reduced latency, and extended device autonomy. Furthermore, the open-source release of the Tinyissimo YOLO implementations facilitates reproducibility and future research, supporting the broader adoption of AI in resource-limited environments.
Future Prospects
The exploration of neural architecture search and quantization-aware training presents opportunities for further optimization and adaptation of the Tinyissimo YOLO variants. Moreover, tighter sensor integration, moving toward AI-in-sensor approaches, may further refine and expand the practical applications of such smart glasses platforms. These enhancements would contribute to the evolving AIoT landscape, bridging the gap between novel processing units and sustainable wearable technology.
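For quantization-aware training specifically, the following is a minimal PyTorch sketch of the workflow: fake-quantization observers are inserted into a small Conv-BN-ReLU model, a few training steps let them calibrate value ranges, and the model is then converted to int8 modules. The toy model, loss, and data are placeholders rather than the Tinyissimo YOLO training pipeline; deployment on the GAP9 would additionally go through the vendor's toolchain.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (DeQuantStub, QuantStub, convert,
                                   get_default_qat_qconfig, prepare_qat)

class TinyQATModel(nn.Module):
    """Small Conv-BN-ReLU stack wrapped in Quant/DeQuant stubs so the whole
    forward pass is simulated in int8 during training."""
    def __init__(self, num_outputs=8):
        super().__init__()
        self.quant = QuantStub()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1, bias=False), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1, bias=False), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Conv2d(32, num_outputs, 1)    # toy detection-style head
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.head(self.features(self.quant(x))))

model = TinyQATModel().train()
model.qconfig = get_default_qat_qconfig("fbgemm")    # default fake-quant config
qat_model = prepare_qat(model)                       # insert fake-quant observers

# A few dummy steps: weights keep training while observers learn value ranges.
opt = torch.optim.SGD(qat_model.parameters(), lr=1e-3)
for _ in range(3):
    out = qat_model(torch.randn(2, 3, 96, 96))
    loss = out.abs().mean()                          # placeholder loss
    loss.backward()
    opt.step()
    opt.zero_grad()

int8_model = convert(qat_model.eval())               # swap in int8 modules
print(int8_model.head)                               # prints a QuantizedConv2d
```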
In summary, this work provides a comprehensive approach to energy-efficient smart glasses through combined hardware and algorithmic innovation, contributing to the evolution of ultra-efficient edge AI systems.