Vehicle-Rear: A New Dataset to Explore Feature Fusion for Vehicle Identification Using Convolutional Neural Networks (1911.05541v3)

Published 13 Nov 2019 in cs.CV and cs.LG

Abstract: This work addresses the problem of vehicle identification through non-overlapping cameras. As our main contribution, we introduce a novel dataset for vehicle identification, called Vehicle-Rear, that contains more than three hours of high-resolution videos, with accurate information about the make, model, color and year of nearly 3,000 vehicles, in addition to the position and identification of their license plates. To explore our dataset we design a two-stream CNN that simultaneously uses two of the most distinctive and persistent features available: the vehicle's appearance and its license plate. This is an attempt to tackle a major problem: false alarms caused by vehicles with similar designs or by very close license plate identifiers. In the first network stream, shape similarities are identified by a Siamese CNN that uses a pair of low-resolution vehicle patches recorded by two different cameras. In the second stream, we use a CNN for OCR to extract textual information, confidence scores, and string similarities from a pair of high-resolution license plate patches. Then, features from both streams are merged by a sequence of fully connected layers for decision. In our experiments, we compared the two-stream network against several well-known CNN architectures using single or multiple vehicle features. The architectures, trained models, and dataset are publicly available at https://github.com/icarofua/vehicle-rear.

Summary

The paper introduces the Vehicle-Rear dataset, which contains more than three hours of high-resolution video and annotations for nearly 3,000 vehicles.
The paper proposes a two-stream CNN architecture that fuses a shape stream and an OCR stream, reaching a 98.92% F-score in vehicle re-identification.
The paper demonstrates that integrating vehicle shape and license plate text features significantly reduces false identifications in multi-camera systems.

Insights on Vehicle Identification and Feature Fusion Using CNNs

The paper studies vehicle identification across non-overlapping cameras, a task relevant to urban surveillance, traffic management, and law enforcement. The researchers focus on reducing false identifications caused by vehicles with similar designs or near-identical license plates. To that end they introduce a novel dataset, Vehicle-Rear, and a two-stream convolutional neural network (CNN) architecture. The dataset covers nearly 3,000 vehicles annotated with make, model, color, and year, along with the position and identification of their license plates.

Methodology

The researchers propose a two-stream CNN approach to exploit the dataset efficiently. This architecture is composed of:

Shape-Stream: Using a Siamese CNN architecture, shape similarities between vehicles are extracted. This twin network approach handles pairs of low-resolution vehicle images to enhance the recognition of visually analogous vehicles.
OCR-Stream: This component converts a pair of high-resolution license plate images into text using a CNN for Optical Character Recognition (OCR). The CNN adapts the CR-NET architecture, trained for Brazilian plates. The OCR stream produces a descriptor containing the recognized text, confidence scores, and string similarities, which helps separate plates whose character sequences differ only slightly.
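The kind of descriptor the OCR-Stream produces can be illustrated with a minimal sketch. This is not the authors' implementation: the `PlateRead` container, the position-wise similarity measure, and the three-element layout are assumptions made for illustration, and the plate strings in the usage note are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlateRead:
    """Hypothetical container for one OCR result: text plus per-character confidences."""
    text: str
    confidences: List[float]


def char_match_ratio(a: str, b: str) -> float:
    """Fraction of aligned positions whose characters agree (0.0 if either string is empty)."""
    if not a or not b:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def ocr_descriptor(p1: PlateRead, p2: PlateRead) -> List[float]:
    """Assumed layout: [string similarity, mean confidence of read 1, mean confidence of read 2]."""
    sim = char_match_ratio(p1.text, p2.text)
    mean1 = sum(p1.confidences) / len(p1.confidences)
    mean2 = sum(p2.confidences) / len(p2.confidences)
    return [sim, mean1, mean2]
```

For example, `ocr_descriptor(PlateRead("ABC1234", [0.9] * 7), PlateRead("ABC1284", [0.8] * 7))` yields a similarity of 6/7, since the two reads differ in one of seven positions. Keeping the confidences next to the similarity lets a later stage discount a near-match that comes from an uncertain read.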

The integration of these two streams via fully connected layers allows the network to make a consolidated decision, enhancing vehicle discrimination.
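The fusion step can be sketched as a concatenation of the two feature vectors followed by fully connected layers. The sketch below is an illustration only, not the published architecture: the hidden size, the ReLU and sigmoid activations, and the random placeholder weights are all assumptions, and a real model would learn its weights from labeled pairs.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Affine layer: one output per weight row, y_i = w_i . x + b_i."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Placeholder weights; a trained model would supply learned values instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def fuse(shape_features: Vector, ocr_features: Vector, rng: random.Random) -> float:
    """Concatenate both streams' features and score the pair with two dense layers."""
    merged = shape_features + ocr_features  # list concatenation of the two streams
    hidden_size = 8  # arbitrary size chosen for illustration
    w1 = random_matrix(hidden_size, len(merged), rng)
    w2 = random_matrix(1, hidden_size, rng)
    hidden = relu(dense(merged, w1, [0.0] * hidden_size))
    logit = dense(hidden, w2, [0.0])[0]
    # Read as the probability that both patches show the same vehicle.
    return sigmoid(logit)
```

Calling `fuse([0.1, 0.4, 0.3, 0.9], [0.86, 0.9, 0.8], random.Random(0))` returns a score between 0 and 1. With untrained weights the value is meaningless; the point is only the data flow from two streams into one decision.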

Experimental Results

The proposed model was compared against several well-known CNN architectures. Key results from these experiments include:

Shape-only Stream Performance: The best-performing architecture for shape-only recognition (Small-VGG) resulted in an F-score of 91.35%.
OCR-only Performance: The CNN-OCR model achieved outstanding results, with a perfect match F-score reaching 94.1%.
Two-Stream Fusion Performance: Combining shape and OCR streams, the final model achieved an impressive F-score of 98.92%, highlighting the synergistic effect of integrating both streams.
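For readers unfamiliar with the metric, the results above are reported as F-scores. A minimal sketch of the computation follows, assuming the balanced F1 form (the harmonic mean of precision and recall); the summary does not state which variant the paper uses, and the counts in the usage note are made up.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1) from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0  # no correct matches at all
    return 2 * precision * recall / (precision + recall)
```

With 90 correct matches, 10 false alarms, and 10 missed matches, `f_score(90, 10, 10)` has precision and recall both equal to 0.9, so the score is also 0.9. Because the metric penalizes false alarms and missed matches alike, it suits a task whose stated goal is reducing false identifications.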

The fusion of vehicle appearance and license plate information significantly enhances the reliability and accuracy of vehicle identification, even under conditions of challenging inter-class similarity.

Implications and Future Directions

The paper underscores the importance of feature fusion in vehicle identification tasks. The Vehicle-Rear dataset, with its legible license plate information, provides a unique resource for researchers addressing vehicle re-identification challenges. Practical implementations of this research could extend to intelligent transportation systems, improving public safety and efficiency in urban traffic management.

Future research directions may explore the incorporation of temporal information from video data or the adaptation of this model to other international license plate formats. Additionally, exploring more efficient network architectures that maintain or improve performance with a reduced computational footprint could facilitate real-world deployment in large-scale urban centers. Moreover, further exploration into privacy-preserving techniques for handling license plate data remains crucial, considering regional privacy regulations.

The paper highlights a robust path forward for developing intelligent surveillance systems that can seamlessly integrate into the expanding smart city infrastructures, providing timely and accurate traffic insights.

GitHub

GitHub - icarofua/vehicle-rear: Vehicle-Rear: A New Dataset to Explore Feature Fusion For Vehicle Identification Using Convolutional Neural Networks (111 stars)