Efficient Edge Deployment of Quantized YOLOv4-Tiny for Aerial Emergency Object Detection on Raspberry Pi 5 (2506.09300v1)

Published 10 Jun 2025 in cs.CV

Abstract: This paper presents the deployment and performance evaluation of a quantized YOLOv4-Tiny model for real-time object detection in aerial emergency imagery on a resource-constrained edge device, the Raspberry Pi 5. The YOLOv4-Tiny model was quantized to INT8 precision using TensorFlow Lite post-training quantization techniques and evaluated for detection speed, power consumption, and thermal feasibility under embedded deployment conditions. The quantized model achieved an inference time of 28.2 ms per image with an average power consumption of 13.85 W, demonstrating a significant reduction in power usage compared to its FP32 counterpart. Detection accuracy remained robust across key emergency classes such as Ambulance, Police, Fire Engine, and Car Crash. These results highlight the potential of low-power embedded AI systems for real-time deployment in safety-critical emergency response applications.

Summary

The paper demonstrates edge deployment of an INT8-quantized YOLOv4-Tiny model on a Raspberry Pi 5, achieving 28.2 ms inference per image and 43.9% lower power consumption than the FP32 model.
The methodology converts Darknet weights to TensorFlow Lite format and applies post-training quantization, reducing model size from 22.5 MB to 6.4 MB for aerial emergency detection.
The evaluation reports a 36% speed improvement over the FP32 model, supporting the quantized model's practical viability for real-time emergency response on low-power embedded devices.


This paper by Sindhu Boddu and Dr. Arindam Mukherjee from the University of North Carolina at Charlotte explores the deployment of an INT8-quantized YOLOv4-Tiny model on a Raspberry Pi 5 for real-time aerial emergency object detection. The research shows that a deep learning detector can run on a low-power embedded system, a concrete example of edge AI deployment.

Summary of Contributions

The paper presents a detailed methodology for implementing a quantized YOLOv4-Tiny network using TensorFlow Lite, focusing on detecting emergency-related objects in aerial imagery, such as ambulances, police vehicles, and car crashes. INT8 quantization is central to achieving efficient inference on a resource-constrained platform. Notably, the quantized model registers an inference time of 28.2 ms per image with an average power consumption of 13.85 W. This is a substantial power reduction compared to the FP32 counterpart, which supports the practicality of deploying real-time detection systems in safety-critical applications where a swift reaction is necessary.

Methodology and Evaluation

The methodology encompasses dataset preparation, model training, conversion to TensorFlow Lite format, and quantization. A set of 10,820 annotated aerial images, representative of emergency scenarios, forms the core of the training data. The deployment pipeline involves several stages, including conversion from Darknet weights to TensorFlow Lite format, followed by static post-training quantization. This approach shrinks the model from 22.5 MB to 6.4 MB, lowering resource demands enough to enable deployment on hardware like the Raspberry Pi 5.
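To make the idea of static post-training quantization more concrete, the following is a minimal pure-Python sketch of affine INT8 quantization: a value range is calibrated from representative samples, and each floating-point value is then mapped to an 8-bit integer through a scale and a zero-point. This illustrates the general technique only; it is not the authors' pipeline or the TensorFlow Lite implementation, and the function names `calibrate`, `quantize`, and `dequantize` as well as the sample values are hypothetical.

```python
from typing import List, Tuple

# Representable range of a signed 8-bit integer.
INT8_MIN, INT8_MAX = -128, 127


def calibrate(samples: List[float]) -> Tuple[float, int]:
    """Derive an affine (scale, zero_point) pair from calibration values."""
    # Extend the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    if hi == lo:
        # Degenerate case: all samples are zero.
        return 1.0, 0
    scale = (hi - lo) / (INT8_MAX - INT8_MIN)
    zero_point = round(INT8_MIN - lo / scale)
    zero_point = max(INT8_MIN, min(INT8_MAX, zero_point))
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to INT8, clamping values outside the calibrated range."""
    q = round(x / scale) + zero_point
    return max(INT8_MIN, min(INT8_MAX, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover the approximate float represented by an INT8 value."""
    return scale * (q - zero_point)


if __name__ == "__main__":
    # Hypothetical calibration values standing in for observed activations.
    calibration = [-1.0, 0.0, 0.5, 3.0]
    scale, zero_point = calibrate(calibration)
    for value in calibration:
        q = quantize(value, scale, zero_point)
        print(value, q, dequantize(q, scale, zero_point))
```

In a static scheme, the scale and zero-point are fixed ahead of time from calibration data rather than computed at inference time, which is why a representative sample of inputs is needed before deployment. Storing each value in 8 bits instead of 32 is also the main reason a quantized model file is smaller.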

The evaluation considered inference time, power consumption, and detection accuracy. The INT8-quantized YOLOv4-Tiny model ran 36% faster and consumed approximately 43.9% less power than its FP32 variant. These results support the model's viability for deployment on constrained hardware.
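The reported figures can be turned into a few derived quantities with simple arithmetic. The sketch below uses only numbers stated in this summary (28.2 ms per image, 13.85 W average power, 22.5 MB and 6.4 MB model sizes). The helper names are hypothetical, and the energy estimate assumes that the average power figure applies for the full duration of each inference, which the summary does not confirm.

```python
# Figures taken from the summary above.
INFERENCE_MS = 28.2    # inference time per image, INT8 model
AVG_POWER_W = 13.85    # average power consumption, INT8 model
SIZE_FP32_MB = 22.5    # model size before quantization
SIZE_INT8_MB = 6.4     # model size after quantization


def throughput_fps(latency_ms: float) -> float:
    """Images per second if inferences run back to back."""
    return 1000.0 / latency_ms


def energy_per_image_j(power_w: float, latency_ms: float) -> float:
    """Energy in joules per inference, assuming constant average power."""
    return power_w * latency_ms / 1000.0


def size_reduction(before_mb: float, after_mb: float) -> float:
    """Fraction of the original size removed by quantization."""
    return 1.0 - after_mb / before_mb


if __name__ == "__main__":
    print(f"Throughput: {throughput_fps(INFERENCE_MS):.1f} images/s")
    print(f"Energy: {energy_per_image_j(AVG_POWER_W, INFERENCE_MS):.3f} J/image")
    print(f"Size reduction: {size_reduction(SIZE_FP32_MB, SIZE_INT8_MB):.1%}")
    print(f"Compression ratio: {SIZE_FP32_MB / SIZE_INT8_MB:.2f}x")
```

Under these assumptions the model processes roughly 35 images per second at about 0.39 J per image, and the quantized file is about 72% smaller than the FP32 one (a compression ratio of roughly 3.5x). These are back-of-the-envelope values derived from the summary, not measurements reported by the authors.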

Implications

Deploying AI models at the edge could strengthen emergency response. By enabling early detection and accurate identification of emergency vehicles and incidents, this approach could improve response times and resource allocation during emergencies. The capabilities of quantized AI models on embedded systems like the Raspberry Pi 5 also suggest potential applications in surveillance, traffic management, and disaster preparedness.

Future Directions

The paper suggests future work on deployment strategies, including real-time video processing, thermal performance monitoring under prolonged loads, and comparative studies on alternative embedded platforms. Additionally, hybrid optimization techniques, such as model pruning and distillation, could further improve efficiency. Integration with alert systems remains a practical avenue, since automatically notifying emergency responders would streamline intervention.

Conclusion

This work effectively demonstrates the feasibility of deploying quantized deep learning models for emergency object detection on resource-limited devices. The paper provides a structured approach for achieving efficient edge AI operation, crucial for deploying real-time detection systems in critical applications. While the current results are promising, further research in real-world environments, along with iterative improvements in model efficiency and system integration, can enhance the scope and impact of AI across emergency management domains.

The insights from this research can inform edge deployment strategies that reduce dependence on high-power computing infrastructure and support AI-driven emergency response operations.