Multi-spectral Vehicle Re-identification with Cross-directional Consistency Network and a High-quality Benchmark (2208.00632v2)
Abstract: To tackle the challenge of vehicle re-identification (Re-ID) in complex lighting environments and diverse scenes, multi-spectral sources such as visible and infrared information are taken into consideration because of their complementary advantages. However, multi-spectral vehicle Re-ID suffers from the cross-modality discrepancy caused by the heterogeneous properties of different modalities, and from the diverse appearance of each identity across different views. Meanwhile, diverse environmental interference leads to a heavy distributional discrepancy among the samples in each modality. In this work, we propose a novel cross-directional consistency network to simultaneously overcome the discrepancies from both the modality and sample aspects. In particular, we design a new cross-directional center loss that pulls the modality centers of each identity close together to mitigate the cross-modality discrepancy, and pulls the sample centers of each identity close together to alleviate the sample discrepancy. This strategy generates discriminative multi-spectral feature representations for vehicle Re-ID. In addition, we design an adaptive layer normalization unit that dynamically adjusts the individual feature distribution to handle the distributional discrepancy of intra-modality features for robust learning. To provide a comprehensive evaluation platform, we create a high-quality RGB-NIR-TIR multi-spectral vehicle Re-ID benchmark (MSVR310), including 310 different vehicles from a broad range of viewpoints, time spans and environmental complexities. Comprehensive experiments on both the created and public datasets demonstrate the effectiveness of the proposed approach compared with state-of-the-art methods.
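The idea of pulling both modality centers and sample centers together can be illustrated with a small NumPy sketch. This is not the paper's formulation: the abstract gives no equation, so the function names, the (modalities, samples, dimensions) tensor layout, the Euclidean distance, and the unweighted sum of the two terms are all assumptions made here for illustration.

```python
import numpy as np


def mean_pairwise_distance(centers):
    """Average Euclidean distance over all unordered pairs of rows."""
    n = centers.shape[0]
    if n < 2:
        return 0.0
    diffs = centers[:, None, :] - centers[None, :, :]   # (n, n, D)
    dists = np.linalg.norm(diffs, axis=-1)              # (n, n)
    upper = np.triu_indices(n, k=1)                     # each pair counted once
    return float(dists[upper].mean())


def cross_directional_center_sketch(features):
    """Hypothetical consistency term for ONE identity.

    features: array of shape (M, N, D), i.e. M modalities (e.g. RGB, NIR,
    TIR), N multi-spectral samples of the identity, D feature dimensions.
    """
    features = np.asarray(features, dtype=float)
    # Modality direction: one center per modality, averaged over samples.
    modality_centers = features.mean(axis=1)            # (M, D)
    # Sample direction: one center per sample, averaged over modalities.
    sample_centers = features.mean(axis=0)              # (N, D)
    # Assumed combination: plain sum of the two pull-together terms.
    return (mean_pairwise_distance(modality_centers)
            + mean_pairwise_distance(sample_centers))


# Two modalities that disagree by a constant offset: only the modality
# term is non-zero, because every sample center coincides.
toy = np.array([[[0.0], [0.0]],
                [[2.0], [2.0]]])
loss = cross_directional_center_sketch(toy)
```

In a training setup, a term of this kind would typically be averaged over the identities in a batch and added to an identification loss. How the authors weight and combine the terms is not stated in the abstract.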
- Aihua Zheng
- Xianpeng Zhu
- Zhiqi Ma
- Chenglong Li
- Jin Tang
- Jixin Ma