Rethinking and Designing a High-performing Automatic License Plate Recognition Approach (2011.14936v2)

Published 30 Nov 2020 in cs.CV

Abstract: In this paper, we propose a real-time and accurate automatic license plate recognition (ALPR) approach. Our study illustrates the outstanding design of ALPR with four insights: (1) the resampling-based cascaded framework is beneficial to both speed and accuracy; (2) the highly efficient license plate recognition should abandon additional character segmentation and recurrent neural network (RNN), but adopt a plain convolutional neural network (CNN); (3) in the case of CNN, taking advantage of vertex information on license plates improves the recognition performance; and (4) the weight-sharing character classifier addresses the lack of training images in small-scale datasets. Based on these insights, we propose a novel ALPR approach, termed VSNet. Specifically, VSNet includes two CNNs, i.e., VertexNet for license plate detection and SCR-Net for license plate recognition, integrated in a resampling-based cascaded manner. In VertexNet, we propose an efficient integration block to extract the spatial features of license plates. With vertex supervisory information, we propose a vertex-estimation branch in VertexNet such that license plates can be rectified as the input images of SCR-Net. In SCR-Net, we introduce a horizontal encoding technique for left-to-right feature extraction and propose a weight-sharing classifier for character recognition. Experimental results show that the proposed VSNet outperforms state-of-the-art methods by more than 50% relative improvement on error rate, achieving > 99% recognition accuracy on CCPD and AOLP datasets with 149 FPS inference speed. Moreover, our method illustrates an outstanding generalization capability when evaluated on the unseen PKUData and CLPD datasets.

Summary

The paper presents VSNet, which decouples detection and recognition in a resampling-based cascaded framework to improve both speed and accuracy.
VertexNet combines residual structures and squeeze-and-excitation modules with a vertex-estimation branch, reaching 99.1% detection precision.
SCR-Net uses a weight-sharing classifier to deliver over 99.5% recognition accuracy in real time with far fewer classifier parameters.

High-Performing Automatic License Plate Recognition (ALPR) Approach

This essay presents a comprehensive summary of the paper titled "Rethinking and Designing a High-performing Automatic License Plate Recognition Approach" (2011.14936). The paper introduces VSNet, a novel Automatic License Plate Recognition (ALPR) system designed to tackle challenges in real-time and accurate license plate detection and recognition in unconstrained environments. The system's architecture consists of two key components: VertexNet for license plate detection and SCR-Net for license plate recognition, integrated through a resampling-based cascaded framework.

System Architecture

VSNet is constructed around two convolutional neural networks: VertexNet and SCR-Net. The design philosophy emphasizes real-time performance without sacrificing accuracy. By employing a resampling-based cascaded framework, the paper decouples the detection and recognition tasks to optimize precision and inference speed.

VertexNet Detection:

Architecture: VertexNet utilizes an innovative integration block composed of residual structures and enhanced squeeze-and-excitation (SE) modules to effectively extract spatial features of license plates.
Vertex Estimation: The network incorporates a vertex-estimation branch that predicts the corner vertices of each license plate, so the plate's actual outline is localized and not only its bounding box.
Trade-offs: The balance between model complexity and detection accuracy is achieved using a compact architecture and small-size input, ensuring efficient processing while maintaining detection robustness.
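
To make the squeeze-and-excitation idea concrete, the following is a minimal NumPy sketch of a standard SE gate wrapped in a residual connection. It is not the paper's enhanced SE module, whose details are not given in this summary; the function names, the `transform` placeholder (standing in for convolution layers), and the weight shapes are all illustrative assumptions.

```python
import numpy as np

def squeeze_excite(features, w_reduce, w_expand):
    """Standard SE gating on a (C, H, W) feature map (not the paper's enhanced variant)."""
    squeezed = features.mean(axis=(1, 2))                # squeeze: global average pool -> (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)        # bottleneck + ReLU -> (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))   # sigmoid channel weights -> (C,)
    return features * gates[:, None, None]               # excite: rescale each channel

def residual_se(features, transform, w_reduce, w_expand):
    """Skip connection around a transform followed by SE gating.

    `transform` is a hypothetical placeholder for the block's convolution layers.
    """
    return features + squeeze_excite(transform(features), w_reduce, w_expand)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16, 16))       # hypothetical 8-channel feature map
    w1 = rng.standard_normal((2, 8)) * 0.1     # reduction ratio r = 4 (assumed)
    w2 = rng.standard_normal((8, 2)) * 0.1
    y = residual_se(x, lambda f: f, w1, w2)
    print(y.shape)
```

The gate costs only two small matrix products per block, which is consistent with the compact-architecture trade-off described above.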

SCR-Net Recognition:

Architecture: SCR-Net is a plain feed-forward CNN enhanced with a horizontal encoding technique tailored for left-to-right feature extraction in license plates (LPs).
Weight-sharing Classifier: This classifier is devised to address sample scarcity in small-scale datasets, offering a drastic parameter reduction compared to fully-connected classifiers while enhancing recognition accuracy.
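
The parameter saving from weight sharing can be illustrated with a small sketch. It assumes that each character position already has its own feature vector (for example, one per horizontal slot of the encoded feature map) and that a single linear classifier is reused across all positions. The sizes used here (7 positions, 128 channels, 68 classes) are hypothetical and not taken from the paper.

```python
import numpy as np

def shared_classifier(position_features, weight, bias):
    """Apply one (C -> K) linear classifier to every character position."""
    return position_features @ weight + bias   # (N, C) @ (C, K) -> (N, K)

def decode(logits):
    """Pick the most likely class index at each position."""
    return logits.argmax(axis=1).tolist()

def parameter_counts(num_positions, channels, num_classes):
    """Compare a shared classifier with a fully-connected one over flattened features."""
    shared = channels * num_classes + num_classes
    fc_inputs = num_positions * channels
    fc_outputs = num_positions * num_classes
    fully_connected = fc_inputs * fc_outputs + fc_outputs
    return shared, fully_connected

if __name__ == "__main__":
    shared, fully_connected = parameter_counts(7, 128, 68)  # hypothetical sizes
    print(shared, fully_connected)
```

Because every position trains the same weights, each labeled plate contributes several training samples to one classifier, which is one way to read the claim that weight sharing helps on small-scale datasets.
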
Figure 1: Framework of the proposed VSNet. An input image is resized to a small resolution, i.e., 256x256, for fast inference in VertexNet. Then, the LP patch is resampled from the original full-resolution input image and rectified to high resolution according to the vertices predicted by VertexNet. Finally, SCR-Net recognizes all characters in the LP.

Novel Contributions and Insights

Resampling-Based Framework: VSNet decouples the input-size requirements of detection and recognition. Detection runs on a small resized image for speed, while the LP patch used for recognition is resampled from the original high-resolution input to preserve character detail.
Vertex Supervisory Information: By leveraging vertex information to rectify LP images, the system significantly boosts recognition performance.
Efficient Use of CNNs: The approach intentionally avoids character segmentation and RNNs, opting instead for a plain CNN structure that reduces computation and inference time.
Figure 2: Architecture of VSNet. VertexNet consists of the backbone, fusion, and head networks, predicting the bounding boxes and vertices of LPs. SCR-Net resizes and rectifies LP images based on predicted vertices and recognizes all characters.
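
The resampling and rectification steps summarized in this section can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: vertices are assumed to be ordered top-left, top-right, bottom-right, bottom-left, the patch is sampled with nearest-neighbor lookup, and the function names, image size, and vertex coordinates are hypothetical.

```python
import numpy as np

def scale_vertices(vertices_small, small_size, full_size):
    """Map (x, y) vertices from the resized detector input back to the full image."""
    sw, sh = small_size
    fw, fh = full_size
    scale = np.array([fw / sw, fh / sh], dtype=np.float64)
    return np.asarray(vertices_small, dtype=np.float64) * scale

def homography(src, dst):
    """Solve the 3x3 matrix H with H @ [x, y, 1] ~ [u, v, 1] for four point pairs."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def rectify(image, vertices, out_w, out_h):
    """Resample the quadrilateral `vertices` (TL, TR, BR, BL) into an out_h x out_w patch."""
    corners = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]
    H = homography(corners, vertices)              # output pixel -> source pixel
    us, vs = np.meshgrid(np.arange(out_w), np.arange(out_h))
    pts = np.stack([us.ravel(), vs.ravel(), np.ones(us.size)])
    mapped = H @ pts
    xs = np.rint(mapped[0] / mapped[2]).astype(int).clip(0, image.shape[1] - 1)
    ys = np.rint(mapped[1] / mapped[2]).astype(int).clip(0, image.shape[0] - 1)
    return image[ys, xs].reshape(out_h, out_w, *image.shape[2:])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    full_image = rng.integers(0, 256, size=(512, 1024, 3), dtype=np.uint8)
    # Hypothetical detector output in 256x256 coordinates.
    detected = [(60, 120), (100, 118), (101, 132), (61, 134)]
    full_vertices = scale_vertices(detected, (256, 256), (1024, 512))
    plate = rectify(full_image, full_vertices, out_w=96, out_h=32)
    print(plate.shape)  # (32, 96, 3)
```

Sampling the patch from the full-resolution image, and not from the 256x256 detector input, is what lets the detector stay fast without reducing the character detail available to the recognizer.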

Experimental Evaluation

VSNet was evaluated on several widely recognized datasets: CCPD, AOLP, PKUData, and CLPD. Key performance metrics include:

Detection Precision: VertexNet achieved a high detection precision (99.1%) with outstanding speed, demonstrating its effectiveness over prior art.
Recognition Accuracy: SCR-Net exceeded prior state-of-the-art recognition accuracy, reaching over 99.5% on widely used datasets such as CCPD while running at 11.4 ms per image.
Generalization Capability: The system showed robust cross-dataset performance, affirming its adaptability to unseen data conditions.
Figure 3: Qualitative results of VertexNet on the CCPD testing set. Green, blue, and red bounding boxes represent ground truth, true positive detections, and failed detections, respectively.
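
The abstract reports gains as a relative improvement on error rate, which can look small when read as raw accuracy. The short sketch below shows the arithmetic; the accuracies in the example are hypothetical and are not results from the paper.

```python
def relative_error_reduction(baseline_accuracy, new_accuracy):
    """Fraction of the baseline's errors removed by the new method."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - new_error) / baseline_error

if __name__ == "__main__":
    # Hypothetical: going from 98% to 99% accuracy halves the error rate.
    print(round(relative_error_reduction(0.98, 0.99), 3))
```

Near the top of the accuracy range, a one-point accuracy gain can therefore correspond to a large relative reduction in errors.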

Implications and Future Directions

The paper's findings underscore the significance of architectural optimization in ALPR systems to balance speed, accuracy, and resource consumption. By demonstrating substantial performance gains and efficient processing capabilities, the research enhances the application of ALPR in intelligent transport systems and beyond.

Future explorations could include enhancing character recognition under challenging conditions, such as extreme occlusion and variable light environments, potentially by integrating self-attention mechanisms. Moreover, exploring generative models that can augment limited LP datasets for training might offer further refinements in real-time applications.

Conclusion

The proposed VSNet is an efficient, high-performance ALPR system designed for real-time constraints and diverse operating environments. Its use of vertex information, together with a novel weight-sharing classifier, marks a substantial advance in the ALPR domain and broadens the system's application prospects in intelligent transportation systems.