Architectural Insights into Knowledge Distillation for Object Detection: A Comprehensive Review

Published 5 Aug 2025 in cs.CV | (2508.03317v1)

Abstract: Object detection has achieved remarkable accuracy through deep learning, yet these improvements often come with increased computational cost, limiting deployment on resource-constrained devices. Knowledge Distillation (KD) provides an effective solution by enabling compact student models to learn from larger teacher models. However, adapting KD to object detection poses unique challenges due to its dual objectives-classification and localization-as well as foreground-background imbalance and multi-scale feature representation. This review introduces a novel architecture-centric taxonomy for KD methods, distinguishing between CNN-based detectors (covering backbone-level, neck-level, head-level, and RPN/RoI-level distillation) and Transformer-based detectors (including query-level, feature-level, and logit-level distillation). We further evaluate representative methods using the MS COCO and PASCAL VOC datasets with [email protected] as performance metric, providing a comparative analysis of their effectiveness. The proposed taxonomy and analysis aim to clarify the evolving landscape of KD in object detection, highlight current challenges, and guide future research toward efficient and scalable detection systems.