MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection (2103.04224v2)

Published 7 Mar 2021 in cs.CV

Abstract: Existing approaches for unsupervised domain adaptive object detection perform feature alignment via adversarial training. While these methods achieve reasonable improvements in performance, they typically perform category-agnostic domain alignment, thereby resulting in negative transfer of features. To overcome this issue, in this work, we attempt to incorporate category information into the domain adaptation process by proposing Memory Guided Attention for Category-Aware Domain Adaptation (MeGA-CDA). The proposed method consists of employing category-wise discriminators to ensure category-aware feature alignment for learning domain-invariant discriminative features. However, since the category information is not available for the target samples, we propose to generate memory-guided category-specific attention maps which are then used to route the features appropriately to the corresponding category discriminator. The proposed method is evaluated on several benchmark datasets and is shown to outperform existing approaches.

Citations (167)

Summary

Unsupervised Domain Adaptive Object Detection with MeGA-CDA

The paper "MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection" introduces an approach to unsupervised domain adaptation (UDA) for object detection. Existing UDA detectors typically use adversarial training to align the feature distributions of the source and target domains, and that alignment is usually category-agnostic. As a result, features of one object category can be aligned with features of a different category (negative transfer), which degrades detection performance. The authors instead propose category-aware feature alignment, using memory-guided attention to decide which category each feature should be aligned with.

Methodology

The authors present Memory Guided Attention for Category-Aware Domain Adaptation (MeGA-CDA), a framework with three components:

  1. Global Domain Alignment (GDA): Utilizes a global discriminator operating on the entire feature map to perform domain-wide feature alignment. This component attempts to align overall feature distributions between domains without specific attention to category-specific characteristics. Although beneficial, it risks inducing negative feature transfer due to its category-agnostic nature.
  2. Category-Wise Discriminators (CDA): A set of discriminators tailored to align features specific to different categories. These discriminators help ensure that features corresponding to a particular object class are correctly matched between source and target domains.
  3. Memory-Guided Attention (MeGA): To facilitate category-specific feature alignment when annotations are unavailable for the target domain, memory networks are employed. These networks generate attention maps focused on specific categories by leveraging stored prototypes (or class-specific features) which guide features to appropriate discriminators.
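The routing idea in the third component can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `category_attention` and `route_features`, the use of cosine similarity, the softmax over categories, and the `temperature` parameter are all assumptions made for illustration. The sketch only shows how stored prototypes could yield one attention map per category, and how each map could weight the features sent to that category's discriminator.

```python
import numpy as np

def category_attention(features, prototypes, temperature=1.0):
    """Hypothetical memory read: one attention map per category.

    features:   (C, H, W) feature map from the detector backbone
    prototypes: (K, C) one stored prototype vector per category
    returns:    (K, H, W) attention maps that sum to 1 over K at each location
    """
    C, H, W = features.shape
    flat = features.reshape(C, H * W)
    # Assumption: cosine similarity between each location and each prototype.
    flat_n = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + 1e-8)
    proto_n = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    sim = (proto_n @ flat_n) / temperature          # (K, H*W)
    # Assumption: softmax over categories (numerically stabilised).
    sim = sim - sim.max(axis=0, keepdims=True)
    attn = np.exp(sim)
    attn = attn / attn.sum(axis=0, keepdims=True)
    return attn.reshape(-1, H, W)

def route_features(features, attention):
    """Weight the feature map by each category's attention map.

    returns: (K, C, H, W); slice k would be the input to discriminator k.
    """
    return attention[:, None, :, :] * features[None, :, :, :]

rng = np.random.default_rng(0)
features = rng.normal(size=(16, 4, 5))     # toy feature map: C=16, H=4, W=5
prototypes = rng.normal(size=(3, 16))      # toy memory: K=3 categories
attention = category_attention(features, prototypes)
routed = route_features(features, attention)
print(attention.shape, routed.shape)
```

Because the attention maps in this sketch sum to one at every location, the routed slices add back up to the original feature map, so each location's features are split across the category discriminators rather than duplicated.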

During training, the source and target feature maps are passed through the global and category-specific discriminators, with attention maps generated from the memory networks routing features to the category-specific discriminators. The memory networks store category-specific feature prototypes and are updated through explicit write operations that use source-domain data, for which annotations are available.
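A write operation of this kind could look like the sketch below. The summary does not specify the update rule, so masked average pooling, the exponential moving average, the `momentum` value, and the name `write_memory` are assumptions for illustration only. The point it shows is that prototypes are refreshed only from source features whose category is known from annotations.

```python
import numpy as np

def write_memory(prototypes, features, class_masks, momentum=0.9):
    """Hypothetical memory write using annotated source features.

    prototypes:  (K, C) current category prototypes
    features:    (C, H, W) source-domain feature map
    class_masks: (K, H, W) binary masks, 1 where category k is annotated
    returns:     (K, C) updated prototypes (the input array is not modified)
    """
    updated = prototypes.copy()
    for k, mask in enumerate(class_masks):
        area = mask.sum()
        if area == 0:
            # Category k does not appear in this image: keep its prototype.
            continue
        # Masked average pooling over the annotated region of category k.
        pooled = (features * mask[None, :, :]).sum(axis=(1, 2)) / area
        # Assumption: exponential moving average blends old and new values.
        updated[k] = momentum * prototypes[k] + (1.0 - momentum) * pooled
    return updated

# Toy example: constant features, category 0 annotated, category 1 absent.
features = np.full((3, 4, 4), 2.0)
prototypes = np.zeros((2, 3))
class_masks = np.zeros((2, 4, 4))
class_masks[0, :2, :2] = 1.0
updated = write_memory(prototypes, features, class_masks)
print(updated)
```

Target-domain features are never written in this sketch, since no target annotations exist to say which prototype they belong to.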

Experimental Evaluation

MeGA-CDA is evaluated on multiple benchmark datasets covering domain shifts such as weather variation, synthetic-to-real transfer, and cross-camera differences. Across these benchmarks, the method is reported to outperform existing domain adaptation approaches. For example, it improves mean average precision (mAP) on the Cityscapes to Foggy Cityscapes adaptation task, which tests how well a detector handles adverse weather.

Implications and Future Work

Practically, memory-guided attention that incorporates category-level information could benefit object detection systems deployed in changing real-world environments, such as autonomous vehicles that must adapt to new cities or weather conditions. Theoretically, the research opens the way to more sophisticated category-aware feature alignment mechanisms for UDA. Future work might explore optimizing the memory module and extending memory-guided attention to other vision tasks, such as semantic segmentation or instance segmentation, where category-specific feature discrimination plays a crucial role. Extending the approach to semi-supervised settings, where some target-domain annotations are available, could further improve adaptability and accuracy.

Overall, MeGA-CDA represents a significant advancement in domain adaptive object detection by successfully addressing category-specific feature alignment challenges through innovative use of memory networks and attention mechanisms.