Frequency Domain Modality-invariant Feature Learning for Visible-infrared Person Re-Identification (2401.01839v2)
Abstract: Visible-infrared person re-identification (VI-ReID) is challenging due to the significant cross-modality discrepancies between visible and infrared images. While existing methods have focused on designing complex network architectures or using metric learning constraints to learn modality-invariant features, they often overlook which specific component of the image causes the modality discrepancy problem. In this paper, we first reveal that the difference in the amplitude component of visible and infrared images is the primary factor that causes the modality discrepancy, and we further propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective. Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) module and the Phase-Preserving Normalization (PPNorm) module, to enhance the modality-invariant amplitude component and suppress the modality-specific component at both the image and feature levels. Extensive experimental results on two standard benchmarks, SYSU-MM01 and RegDB, demonstrate the superior performance of our FDMNet against state-of-the-art methods.
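The amplitude and phase components mentioned in the abstract come from the 2-D Fourier transform of an image. As a rough illustration only (this is not the paper's IAF or PPNorm module), the sketch below splits an image into amplitude and phase with NumPy, then recombines the phase of one image with the amplitude of another. The function names, the random stand-in images, and the 32x16 array shape are all assumptions made for the example.

```python
import numpy as np

def split_amplitude_phase(img):
    """Return (amplitude, phase) of the 2-D FFT taken over the last two axes."""
    spectrum = np.fft.fft2(img, axes=(-2, -1))
    return np.abs(spectrum), np.angle(spectrum)

def recombine(amplitude, phase):
    """Rebuild a spatial-domain image from an amplitude and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    # For spectra derived from real images the imaginary part is numerical noise.
    return np.fft.ifft2(spectrum, axes=(-2, -1)).real

def swap_amplitude(phase_source, amplitude_source):
    """Keep the phase of `phase_source`, take the amplitude of `amplitude_source`."""
    amp, _ = split_amplitude_phase(amplitude_source)
    _, pha = split_amplitude_phase(phase_source)
    return recombine(amp, pha)

# Hypothetical stand-ins for a visible and an infrared image (single channel).
rng = np.random.default_rng(0)
visible = rng.random((32, 16))
infrared = rng.random((32, 16))

amp_v, pha_v = split_amplitude_phase(visible)
reconstructed = recombine(amp_v, pha_v)   # round trip back to the input
mixed = swap_amplitude(visible, infrared) # visible phase, infrared amplitude
```

Under these assumptions, `reconstructed` matches `visible` up to floating-point error, while `mixed` carries the amplitude spectrum of `infrared` with the phase of `visible`. With real image pairs, this kind of swap is a common way to inspect what each component contributes.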
Authors: Yulin Li, Tianzhu Zhang, Yongdong Zhang