Robust Domain Adaptive Object Detection with Unified Multi-Granularity Alignment (2301.00371v2)

Published 1 Jan 2023 in cs.CV

Abstract: Domain adaptive detection aims to improve the generalization of detectors on the target domain. To reduce the discrepancy in feature distributions between two domains, recent approaches achieve domain adaptation through feature alignment at different granularities via adversarial learning. However, they neglect the relationship between multiple granularities and different features in alignment, degrading detection performance. To address this, we introduce a unified multi-granularity alignment (MGA)-based detection framework for domain-invariant feature learning. The key is to encode the dependencies across different granularities, including pixel-, instance-, and category-levels, simultaneously to align the two domains. Specifically, based on pixel-level features, we first develop an omni-scale gated fusion (OSGF) module to aggregate discriminative representations of instances with scale-aware convolutions, leading to robust multi-scale detection. Besides, we introduce multi-granularity discriminators to identify which domain, source or target, samples of different granularities come from. Note that MGA not only leverages instance discriminability across different categories but also exploits category consistency between the two domains for detection. Furthermore, we present an adaptive exponential moving average (AEMA) strategy that exploits model assessments during model updates to improve pseudo labels and alleviate the local misalignment problem, boosting detection robustness. Extensive experiments on multiple domain adaptation scenarios validate the superiority of MGA over other approaches on both FCOS and Faster R-CNN detectors. Code will be released at https://github.com/tiankongzhang/MGA.
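The two mechanisms named in the abstract lend themselves to short sketches: adversarial multi-granularity alignment is commonly implemented with a gradient reversal layer feeding domain discriminators, and the AEMA strategy amounts to an exponential-moving-average teacher whose momentum is modulated by a model assessment score. The PyTorch sketch below illustrates both ideas under those assumptions; the names (GradientReversal, PixelDomainDiscriminator, adaptive_ema_update) and the specific momentum schedule are illustrative stand-ins, not the authors' exact formulation.

    import torch
    import torch.nn as nn


    class GradientReversal(torch.autograd.Function):
        """Identity in the forward pass; flips and scales gradients in the backward pass."""

        @staticmethod
        def forward(ctx, x, lambd):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            # Reversed gradient pushes the backbone to fool the domain discriminator.
            return -ctx.lambd * grad_output, None


    class PixelDomainDiscriminator(nn.Module):
        """Per-location source/target classifier on pixel-level features (illustrative)."""

        def __init__(self, in_channels=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 256, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(256, 1, kernel_size=1),  # one domain logit per spatial location
            )

        def forward(self, features, lambd=1.0):
            return self.net(GradientReversal.apply(features, lambd))


    @torch.no_grad()
    def adaptive_ema_update(teacher, student, base_momentum=0.999, assessment=1.0):
        """EMA teacher update whose momentum depends on a model-assessment score in [0, 1].

        The modulation rule below is a hypothetical stand-in for the paper's AEMA schedule:
        a higher assessment lets the student contribute more to the teacher.
        Assumes teacher and student share the same architecture.
        """
        assessment = max(0.0, min(assessment, 1.0))
        momentum = base_momentum - (base_momentum - 0.99) * assessment
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

In a training loop, source and target feature maps would both pass through the discriminator with opposite domain labels under a binary cross-entropy loss, and adaptive_ema_update would be called once per iteration after the student's optimizer step.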

Authors (5)
  1. Libo Zhang (105 papers)
  2. Wenzhang Zhou (6 papers)
  3. Heng Fan (360 papers)
  4. Tiejian Luo (23 papers)
  5. Haibin Ling (142 papers)
Citations (8)
