YOLO-LITE: A Real-Time Solution for Non-GPU Object Detection
The paper presents "YOLO-LITE," an adaptation of the YOLO (You Only Look Once) framework designed to enable real-time object detection on non-GPU devices such as mobile phones and laptops. This model aims to extend the reach of sophisticated object detection algorithms by crafting a lightweight architecture that can function efficiently on computationally limited platforms.
Core Contributions
The primary objective of YOLO-LITE is to develop a more accessible object detection model without sacrificing performance significantly. The model, inspired by YOLOv2, offers notable adaptations:
- Shallow Architecture: By trimming the network to seven layers and roughly 482 million FLOPs, YOLO-LITE runs at 21 FPS on a non-GPU computer. This is a substantial gain over comparable lightweight models such as SSD MobileNet v1, which it outpaces by a factor of approximately 3.8.
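To give a sense of where a budget like 482 million FLOPs comes from, the multiply-accumulate count of a stride-1 convolution can be estimated from its shape alone. The layer sizes in the sketch below are illustrative placeholders, not the paper's published configuration:

```python
def conv_flops(h, w, c_in, c_out, k):
    """FLOPs for one stride-1, 'same'-padded conv layer: each of the
    h*w output positions computes c_out dot products of length
    k*k*c_in, counting a multiply-add as two FLOPs."""
    return 2 * h * w * c_out * k * k * c_in

# Illustrative first layer on a 224x224 RGB input with 16 3x3 filters
# (hypothetical sizes, not YOLO-LITE's exact table):
first_layer = conv_flops(224, 224, 3, 16, 3)
print(first_layer)  # 43352064
```

Summing this estimate over every layer in a candidate architecture gives the total budget that the paper's speed/accuracy trade-off revolves around.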
- Batch Normalization Observations: The paper argues that batch normalization, while advantageous in deeper networks, becomes unnecessary overhead in a shallow model like YOLO-LITE. Removing it raised throughput from 9.5 to 21 FPS without greatly impacting accuracy.
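The batch-norm trade-off above has a standard complement at inference time: when BN is kept during training, its parameters can be folded into the preceding layer's weights so the deployed model pays no BN cost at all. This is a general technique, not something from the YOLO-LITE paper; a minimal NumPy sketch on a linear layer (standing in for a 1x1 convolution):

```python
import numpy as np

def fold_bn(W, b, gamma, beta, mu, var, eps=1e-5):
    """Fold batch-norm parameters (scale gamma, shift beta, running
    mean mu, running variance var) into the preceding layer's weights,
    so inference needs only one fused affine op."""
    scale = gamma / np.sqrt(var + eps)          # per-output-channel scale
    return W * scale[:, None], scale * (b - mu) + beta

rng = np.random.default_rng(0)
c_in, c_out = 4, 3
W = rng.normal(size=(c_out, c_in)); b = rng.normal(size=c_out)
gamma = rng.normal(size=c_out); beta = rng.normal(size=c_out)
mu = rng.normal(size=c_out); var = rng.uniform(0.5, 2.0, size=c_out)

x = rng.normal(size=c_in)
# Reference path: layer output, then batch norm.
y_bn = gamma * ((W @ x + b) - mu) / np.sqrt(var + 1e-5) + beta
# Fused path: one affine op with folded weights.
Wf, bf = fold_bn(W, b, gamma, beta, mu, var)
assert np.allclose(Wf @ x + bf, y_bn)
```

Folding recovers BN's inference cost without retraining, whereas YOLO-LITE removes BN from training entirely; the two choices address different stages of the pipeline.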
- Real-Time Implementation: YOLO-LITE extends real-time detection capabilities to web platforms and mobile devices, running at approximately 10 FPS even when integrated into a web-based application.
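Frame-rate claims like the 21 FPS and ~10 FPS figures can be reproduced with a small timing harness. In the sketch below, `infer` is a placeholder for any detection function, and the warm-up pass is an assumption added to absorb one-time setup costs:

```python
import time

def measure_fps(infer, frames, warmup=3):
    """Average frames per second of `infer` over `frames`, after a
    few warm-up calls to absorb first-call overhead."""
    for f in frames[:warmup]:
        infer(f)
    start = time.perf_counter()
    for f in frames:
        infer(f)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Usage with a dummy "detector" that just touches each frame:
fps = measure_fps(lambda frame: frame * 2, list(range(200)))
```

In a browser deployment, the analogous measurement would wrap the model's per-frame callback rather than a Python function.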
Methodology
YOLO-LITE was subjected to extensive experimentation across architecture variants, with layer counts and filter settings tuned to balance speed against accuracy. Evaluations on the PASCAL VOC and COCO datasets yielded mean Average Precision (mAP) scores of 33.81% and 12.26%, respectively. The iterative process involved:
- Comparing results from modifications in image input size and layer configurations.
- Analyzing the impact of removing batch normalization on both speed and performance metrics.
- Exploring architectural changes to optimize neural network pathways for reduced computational demand.
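The mAP figures above rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes; PASCAL VOC counts a detection as correct when IoU is at least 0.5. A self-contained IoU for corner-format boxes:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2) corner coordinates."""
    ix1 = max(box_a[0], box_b[0]); iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2]); iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes overlapping in a unit square: intersection 1, union 7.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # 1/7
```

Per-class average precision is then computed over the precision-recall curve of detections matched at this threshold, and mAP is the mean across classes.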
Results and Implications
Experiments showed that YOLO-LITE can deliver usable object detection accuracy at real-time speeds on non-GPU systems. Despite the trade-off in mAP relative to heavier models, the marked gain in processing speed points to practical applications wherever computational power is a constraint.
This paper opens discourse on a few key implications:
- Practical Applications: YOLO-LITE is poised to impact fields requiring real-time detection without dedicated hardware acceleration, such as mobile computing and embedded systems.
- Further Research: While the model’s speed is noteworthy, improving its accuracy remains crucial. Future work could explore more sophisticated pruning techniques, integration of group convolutions, and novel optimization algorithms that could enhance mAP while preserving efficiency.
- Theoretical Insights: The removal of batch normalization in YOLO-LITE challenges the conventional assumption that it is necessary in smaller models, suggesting the need for further empirical studies into training techniques optimized for lightweight architectures.
Conclusion
YOLO-LITE offers a promising contribution to the domain of object detection, especially for developers seeking efficient implementations without the computational heft of a full-sized YOLO variant. Its development highlights an important trend towards adaptable, versatile machine learning solutions that facilitate broader accessibility and application of AI. As the demand for real-time, mobile-capable AI grows, models like YOLO-LITE will undoubtedly pave the way for more inclusive technological ecosystems.