PAFNet: An Efficient Anchor-Free Object Detector Guidance (2104.13534v1)

Published 28 Apr 2021 in cs.CV

Abstract: Object detection is a basic but challenging task in computer vision that plays a key role in a variety of industrial applications. However, object detectors based on deep learning usually require more storage and longer inference time, which seriously hinders their practicality. A trade-off between effectiveness and efficiency is therefore necessary in practical scenarios. Without the constraint of pre-defined anchors, anchor-free detectors can achieve acceptable accuracy and inference speed simultaneously. In this paper, we start from an anchor-free detector called TTFNet, modify its structure, and introduce multiple existing tricks to realize effective server and mobile solutions respectively. Since all experiments in this paper are conducted on PaddlePaddle, we call the model PAFNet (Paddle Anchor Free Network). On the server side, PAFNet achieves a better balance between effectiveness (42.2% mAP) and efficiency (67.15 FPS) on a single V100 GPU. On the mobile side, PAFNet-lite achieves 23.9% mAP at 26.00 ms on a Kirin 990 ARM CPU, outperforming the existing state-of-the-art anchor-free detectors by significant margins. Source code is at https://github.com/PaddlePaddle/PaddleDetection.

Authors (8)

Ying Xin (12 papers)
Guanzhong Wang (34 papers)
Mingyuan Mao (6 papers)
Yuan Feng (109 papers)
Qingqing Dang (15 papers)
Yanjun Ma (29 papers)
Errui Ding (156 papers)
Shumin Han (18 papers)

Summary

A Formal Analysis of PAFNet: An Efficient Anchor-Free Object Detector Guidance

The paper "PAFNet: An Efficient Anchor-Free Object Detector Guidance" introduces PAFNet, an anchor-free object detection framework. It targets the storage and inference cost of deep-learning-based detectors, aiming to balance computational efficiency against detection accuracy.

Overview of PAFNet

PAFNet builds on TTFNet, an existing anchor-free detector, by modifying its architecture and adding a set of existing training and optimization techniques. The paper provides two variants: PAFNet for server-side deployment and PAFNet-lite for mobile devices.

Server-Side Implementation

Architecture: Utilizes ResNet50-vd as the backbone, integrated with an AGS (Attention-guided Sampling) module to enhance feature extraction.
Performance: Achieves 42.2% mAP with a frame rate of 67.15 FPS on a single V100 GPU.
Methods: The server-side model employs Simple Semi-supervised Label Distillation (SSLD), data augmentation techniques such as CutMix, and multiple training schedules. Adding deformable convolutions (DCN) further improves accuracy.
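To make the CutMix step concrete, here is a minimal NumPy sketch of the generic image-level operation: a random rectangle from one image is pasted into another. It is not PAFNet's implementation. The function name `cutmix`, the `alpha` parameter, and the returned values are assumptions for illustration, and a detection pipeline would also need to merge the box annotations of both images.

```python
import numpy as np

def cutmix(img_a, img_b, rng, alpha=1.0):
    """Paste a random rectangle of img_b into a copy of img_a.

    Hypothetical helper: returns the mixed image, the pasted box
    (x1, y1, x2, y2), and the fraction of pixels still taken from img_a.
    """
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)              # target share kept from img_a
    cut_h = int(h * np.sqrt(1.0 - lam))       # rectangle size scales with 1 - lam
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
    # Clip the rectangle to the image bounds.
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    # Actual kept fraction after clipping (may differ from lam).
    kept = 1.0 - (y2 - y1) * (x2 - x1) / float(h * w)
    return mixed, (x1, y1, x2, y2), kept
```

The kept fraction is recomputed after clipping because the sampled rectangle can extend past the image border.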

Mobile-Side Implementation

Architecture: Implements MobileNetV3-Large as the backbone, focusing on reducing computational overhead with a lightweight head structure.
Performance: Attains an mAP of 23.9% with a latency of 26 ms on a Kirin 990 ARM CPU.
Methods: SSLD, together with augmentation methods such as GridMask and strategies borrowed from PP-YOLO, contributes to its competitive accuracy within tight computational constraints.
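As an illustration of the GridMask idea, the sketch below zeroes out a regular grid of square regions in an image. It is a simplified version under stated assumptions: the names `grid_mask` and `apply_grid_mask`, the parameters `d` (grid period) and `hole` (side of each dropped square), and the fixed offsets are all hypothetical, and the rotation and random sampling used in practice are left out.

```python
import numpy as np

def grid_mask(h, w, d, hole, dx=0, dy=0):
    """Return an (h, w) uint8 mask of 1s with a regular grid of zeroed squares."""
    ys = (np.arange(h) + dy) % d < hole   # rows that fall inside a hole band
    xs = (np.arange(w) + dx) % d < hole   # columns that fall inside a hole band
    # A pixel is dropped only where a hole row meets a hole column.
    return 1 - np.outer(ys, xs).astype(np.uint8)

def apply_grid_mask(img, d=16, hole=8, dx=0, dy=0):
    """Multiply an (H, W) or (H, W, C) image by the grid mask."""
    mask = grid_mask(img.shape[0], img.shape[1], d, hole, dx, dy)
    return img * mask[..., None] if img.ndim == 3 else img * mask
```

In a training pipeline, `d`, `dx` and `dy` would typically be sampled per image so the network never sees the same hole pattern twice.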

Numerical Results and Comparative Analysis

The paper reports gains over TTFNet and other state-of-the-art anchor-free detectors. On the server side, adding DCN and extending training (the 10x schedule) significantly improve accuracy, and the final model reaches 42.2% mAP at 67.15 FPS on a single V100 GPU. On the mobile side, the MobileNetV3 backbone combined with data augmentation yields 23.9% mAP at 26 ms on a Kirin 990 ARM CPU.
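The server result is reported as throughput (FPS) and the mobile result as latency (ms), so a small conversion helps when reading them side by side. The helper names below are hypothetical, and the conversion assumes one image is processed at a time. The two figures come from different hardware, so they are not a like-for-like comparison.

```python
def fps_to_ms(fps):
    """Per-image latency in milliseconds for a given frames-per-second rate."""
    return 1000.0 / fps

def ms_to_fps(ms):
    """Frames per second for a given per-image latency in milliseconds."""
    return 1000.0 / ms

server_ms = fps_to_ms(67.15)   # PAFNet on V100: about 14.89 ms per image
mobile_fps = ms_to_fps(26.0)   # PAFNet-lite on Kirin 990: about 38.46 FPS
print(f"{server_ms:.2f} ms, {mobile_fps:.2f} FPS")
```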

Theoretical and Practical Implications

PAFNet illustrates how far anchor-free detectors can be pushed with existing techniques. Dropping pre-defined anchors removes the anchor-design step and its hyperparameters, which simplifies the detector and makes it easier to adapt to different datasets and devices. In practice, PAFNet is well suited to real-time applications, particularly where computational resources are limited.
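To show what working without anchors looks like at inference time, here is a rough sketch of decoding boxes from a center heatmap, in the style of center-based detectors such as TTFNet. It is not taken from the PAFNet code. The tensor layout, the output stride of 4, the score threshold, and the function name `decode_centers` are assumptions.

```python
import numpy as np

def decode_centers(heatmap, ltrb, stride=4, score_thr=0.3):
    """Decode boxes from a single-class center heatmap without anchors.

    heatmap: (H, W) float scores in [0, 1].
    ltrb:    (4, H, W) distances, in input pixels, from each location to the
             left, top, right and bottom box edges (assumed layout).
    Returns an (N, 5) array of [x1, y1, x2, y2, score] rows.
    """
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Maximum over each 3x3 neighbourhood; keeping only local maxima is the
    # usual center-based stand-in for box NMS.
    neigh = np.max(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    peaks = (heatmap == neigh) & (heatmap >= score_thr)
    ys, xs = np.nonzero(peaks)
    cx, cy = xs * stride, ys * stride          # map grid cells to input pixels
    l, t, r, b = ltrb[:, ys, xs]
    return np.stack([cx - l, cy - t, cx + r, cy + b, heatmap[ys, xs]], axis=1)
```

Every box comes directly from a heatmap peak and four regressed distances, so there are no anchor sizes or aspect ratios to tune.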

Future Directions in AI

The paper suggests several directions for further research. The efficiency gains demonstrated by PAFNet could be extended by exploring lighter model architectures and more advanced attention mechanisms. Adaptive learning strategies and more robust augmentation techniques may also improve both accuracy and generalization.

Conclusion

PAFNet shows that an anchor-free detector built from existing techniques can deliver practical, efficient solutions on both server and mobile hardware. The paper's account of which strategies help, and by how much, offers a useful reference for refining real-time detection systems in industry and academia.

GitHub

GitHub - PaddlePaddle/PaddleDetection: Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection. (12,241 stars)