Graph Convolution Based Efficient Re-Ranking for Visual Retrieval (2306.08792v1)

Published 15 Jun 2023 in cs.CV

Abstract: Visual retrieval tasks such as image retrieval and person re-identification (Re-ID) aim at effectively and thoroughly searching images with similar content or the same identity. After obtaining retrieved examples, re-ranking is a widely adopted post-processing step that reorders and improves the initial retrieval results by exploiting contextual information from semantically neighboring samples. Prevailing re-ranking approaches update distance metrics and mostly rely on inefficient cross-check set comparison operations when computing expanded-neighbor-based distances. In this work, we present an efficient re-ranking method that refines initial retrieval results by updating features. Specifically, we reformulate re-ranking in terms of Graph Convolution Networks (GCN) and propose a novel Graph Convolution based Re-ranking (GCR) method for visual retrieval tasks via feature propagation. To accelerate computation for large-scale retrieval, a decentralized and synchronous feature propagation algorithm that supports parallel or distributed computing is introduced. In particular, the plain GCR is extended to cross-camera retrieval, with an improved feature propagation formulation that leverages affinity relationships across different cameras. It is also extended to video-based retrieval: Graph Convolution based Re-ranking for Video (GCRV) is proposed by mathematically deriving a novel profile vector generation method for tracklets. Without bells and whistles, the proposed approaches achieve state-of-the-art performance on seven benchmark datasets spanning three tasks: image retrieval, person Re-ID, and video-based person Re-ID.
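
The core idea is to refine the features themselves rather than the distance metric: build an affinity graph over query and gallery samples, then propagate features along the graph so each sample's representation absorbs context from its semantic neighbors. Below is a minimal NumPy sketch of this feature-propagation style of re-ranking; the k-NN graph construction, the symmetric normalization, and all parameter names (k, alpha, num_iters) are illustrative assumptions rather than the paper's exact GCR formulation.

```python
# A minimal sketch of graph-convolution-style re-ranking by feature
# propagation. Graph construction and hyperparameters here are assumptions
# for illustration, not the paper's exact method.
import numpy as np

def gcr_rerank(features, k=10, alpha=0.5, num_iters=2):
    """Refine features by propagating them over a k-NN affinity graph."""
    # L2-normalize so inner products are cosine similarities.
    x = features / np.linalg.norm(features, axis=1, keepdims=True)

    sim = x @ x.T                      # pairwise cosine similarity
    n = sim.shape[0]

    # Sparsify: keep each sample's k strongest neighbors (self included).
    adj = np.zeros_like(sim)
    topk = np.argsort(-sim, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    adj[rows, topk.ravel()] = sim[rows, topk.ravel()]
    adj = np.maximum(adj, adj.T)       # symmetrize the affinity graph

    # Symmetrically normalized adjacency, as in standard GCN layers.
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Iterative feature propagation: blend each feature with the
    # aggregated features of its graph neighbors.
    for _ in range(num_iters):
        x = alpha * x + (1 - alpha) * (norm_adj @ x)
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x

# Usage: refine features, then rank gallery items for each query by
# cosine similarity of the propagated features.
feats = np.random.rand(100, 128).astype(np.float32)
refined = gcr_rerank(feats)
ranking = np.argsort(-(refined[:5] @ refined.T), axis=1)
```

Because the update is a per-node blend of a sample's own feature with its neighbors' aggregated features, each node can be updated independently given a snapshot of its neighbors, which is what makes a decentralized, synchronous (parallel or distributed) implementation natural.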

Authors (6)
  1. Yuqi Zhang (54 papers)
  2. Qi Qian (54 papers)
  3. Hongsong Wang (25 papers)
  4. Chong Liu (104 papers)
  5. Weihua Chen (35 papers)
  6. Fan Wang (313 papers)
Citations (11)