
Fast Object Detection with a Machine Learning Edge Device (2410.04173v1)

Published 5 Oct 2024 in cs.RO and cs.CV

Abstract: This machine learning study investigates a low-cost edge device integrated with an embedded system having computer vision and resulting in an improved performance in inferencing time and precision of object detection and classification. A primary aim of this study focused on reducing inferencing time and low-power consumption and to enable an embedded device of a competition-ready autonomous humanoid robot and to support real-time object recognition, scene understanding, visual navigation, motion planning, and autonomous navigation of the robot. This study compares processors for inferencing time performance between a central processing unit (CPU), a graphical processing unit (GPU), and a tensor processing unit (TPU). CPUs, GPUs, and TPUs are all processors that can be used for machine learning tasks. Related to the aim of supporting an autonomous humanoid robot, there was an additional effort to observe whether or not there was a significant difference in using a camera having monocular vision versus stereo vision capability. TPU inference time results for this study reflect a 25% reduction in time over the GPU, and a whopping 87.5% reduction in inference time compared to the CPU. Much information in this paper is contributed to the final selection of Google's Coral brand, Edge TPU device. The Arduino Nano 33 BLE Sense Tiny ML Kit was also considered for comparison but due to initial incompatibilities and in the interest of time to complete this study, a decision was made to review the kit in a future experiment.

Summary

  • The paper presents the use of a Google Coral TPU for real-time object detection, achieving a 25% reduction in inference time over GPUs and an 87.5% reduction over CPUs.
  • It uses the YOLOv8 architecture with TensorFlow Lite and transfer learning to maintain roughly 80% precision while minimizing power consumption.
  • The study highlights practical implications for robotics by enabling low-cost, power-efficient systems suitable for autonomous navigation and the RoboCup competition.

Analysis of Fast Object Detection Using Edge Devices

The paper presents a thorough investigation into the application of edge devices for fast object detection using machine learning techniques. The primary objective is to enable real-time object recognition and classification, with reduced inferencing time and power consumption, thereby making edge computing viable for use in resource-constrained environments. The paper is particularly relevant for applications such as autonomous navigation and robotic competitions, where cost and efficiency are paramount considerations.

Hardware and Methodology

The comparative analysis covers three types of processors: a central processing unit (CPU), a graphics processing unit (GPU), and a tensor processing unit (TPU). The authors select the Google Coral Edge TPU because it delivered lower inference times than the GPU and CPU in their tests. The paper evaluates the TPU in a practical robot-soccer scenario from the RoboCup competition and reports a 25% reduction in inference time relative to the GPU and an 87.5% reduction relative to the CPU.
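To put the two percentages on a common scale, the following minimal sketch converts a reported time reduction into a speedup factor. It assumes that "reduction" means (baseline time - TPU time) / baseline time and that both figures were measured on the same workload; the function name `speedup_from_reduction` is illustrative and does not come from the paper.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional time reduction into a speedup factor.

    Assumes reduction = (t_baseline - t_new) / t_baseline, so that
    t_new = (1 - reduction) * t_baseline and speedup = t_baseline / t_new.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# Reductions reported in the paper (TPU relative to each baseline).
tpu_vs_gpu = speedup_from_reduction(0.25)    # roughly 1.33x faster than the GPU
tpu_vs_cpu = speedup_from_reduction(0.875)   # 8x faster than the CPU

# Implied GPU-vs-CPU speedup, valid only if both figures share one workload.
gpu_vs_cpu = tpu_vs_cpu / tpu_vs_gpu

print(f"TPU vs GPU: {tpu_vs_gpu:.2f}x")
print(f"TPU vs CPU: {tpu_vs_cpu:.2f}x")
print(f"GPU vs CPU (implied): {gpu_vs_cpu:.2f}x")
```

Read this way, an 87.5% reduction means the TPU needs one eighth of the CPU's time per inference, while a 25% reduction means it needs three quarters of the GPU's time.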

The hardware also includes an Intel RealSense D435i stereo camera and a generic monocular webcam. The experiments found that a monocular camera paired with the TPU offers object detection performance comparable to the stereo camera at a lower cost, reinforcing the TPU's viability in low-power embedded systems.

Software Framework

The implementation combines Python tools and frameworks, including OpenCV and TensorFlow, with the YOLOv8 architecture for object detection. TensorFlow Lite handles inference and model deployment on the edge device. Transfer learning and quantization adapt the network to the TPU while maintaining competitive accuracy and minimizing energy consumption.
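The summary does not say which quantization scheme the authors used. As a rough illustration of the idea, the sketch below implements the 8-bit affine mapping commonly used for integer-only TensorFlow Lite models, in plain Python. The function names and sample weights are hypothetical and not taken from the paper.

```python
def quantize_params(x_min, x_max, q_min=-128, q_max=127):
    """Return (scale, zero_point) mapping [x_min, x_max] onto [q_min, q_max]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0
    scale = (x_max - x_min) / (q_max - q_min)
    zero_point = int(round(q_min - x_min / scale))
    return scale, max(q_min, min(q_max, zero_point))


def quantize(values, scale, zero_point, q_min=-128, q_max=127):
    """Map floats to int8 codes: q = clamp(round(x / scale) + zero_point)."""
    return [max(q_min, min(q_max, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats: x ~ scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


# Hypothetical weights, used only to show the round trip.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = quantize_params(min(weights), max(weights))
codes = quantize(weights, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", codes)
print(f"largest round-trip error: {max_error:.5f} (scale = {scale:.5f})")
```

Each weight is stored in one byte instead of four, and the round-trip error stays on the order of half the scale, which is why a quantized network can keep accuracy close to the floating-point original.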

Key Results

The results show substantial reductions in inference time from using the TPU: 25% relative to the GPU and 87.5% relative to the CPU. This demonstrates the TPU's suitability for real-time applications in resource-constrained environments. The model also reaches a precision of 80% under specific conditions, illustrating its effectiveness at detecting objects such as soccer balls in robotic applications.
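For readers unfamiliar with the metric, precision is the share of predicted detections that are correct, TP / (TP + FP). The sketch below computes it with greedy one-to-one matching at an intersection-over-union (IoU) threshold. The threshold of 0.5, the matching rule, and all box coordinates are assumptions for illustration; the summary does not describe the paper's evaluation procedure.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision(predictions, ground_truth, iou_threshold=0.5):
    """Precision = TP / (TP + FP), matching each ground-truth box at most once."""
    unmatched = list(ground_truth)
    true_positives = 0
    for box in predictions:
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            true_positives += 1
            unmatched.remove(best)
    return true_positives / len(predictions) if predictions else 0.0


# Hypothetical boxes: four predictions overlap a labeled ball, one does not.
labeled = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50), (60, 60, 70, 70)]
predicted = [(1, 1, 10, 10), (20, 20, 29, 30), (40, 41, 50, 50),
             (60, 60, 70, 70), (80, 80, 90, 90)]

print(f"precision: {precision(predicted, labeled):.2f}")
```

Precision alone says nothing about missed objects, so it is usually read alongside recall when judging a detector.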

Implications and Future Directions

By demonstrating the TPU's capabilities, the research has tangible implications for future AI-driven embedded systems, notably those requiring a compact form factor and power efficiency. The paper also raises ethical considerations, such as the risk of bias in training datasets, potential invasions of privacy, and the broader social impact of deploying autonomous technologies widely.

Future work, as indicated by the authors, will focus on integrating the TPU device directly with competitive humanoid robots for enhanced performance in dynamic environments like sports. This involves advancing the intersection of artificial intelligence, machine learning methodologies, and robotics, to achieve more autonomous and responsive machines.

In conclusion, the paper provides convincing evidence for the efficacy of using TPUs over traditional processors in scenarios necessitating fast and precise object detection. This work also paves the way for further exploration into optimizing low-cost, power-efficient solutions in a myriad of applications beyond the RoboCup competition, thereby contributing to the broader discourse on object detection and machine learning at the edge.