Siamese DETR (2303.18144v1)

Published 31 Mar 2023 in cs.CV

Abstract: Recent self-supervised methods are mainly designed for representation learning with the base model, e.g., ResNets or ViTs. They cannot be easily transferred to DETR, which has task-specific Transformer modules. In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR. We consider learning view-invariant and detection-oriented representations simultaneously through two complementary tasks, i.e., localization and discrimination, in a novel multi-view learning framework. Two self-supervised pretext tasks are designed: (i) Multi-View Region Detection, which learns to localize regions of interest between augmented views of the input, and (ii) Multi-View Semantic Discrimination, which improves object-level discrimination for each region. The proposed Siamese DETR achieves state-of-the-art transfer performance on COCO and PASCAL VOC detection using different DETR variants in all setups. Code is available at https://github.com/Zx55/SiameseDETR.
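
To make the two pretext tasks concrete, below is a minimal sketch of how a multi-view pretraining step combining cross-view region detection and semantic discrimination could look. This is not the authors' implementation (see the linked repository for that); the `model` interface, argument names, and loss weighting here are assumptions for illustration only.

```python
# Illustrative sketch of a Siamese multi-view pretraining step.
# Hypothetical interface: `model(view, query_boxes=...)` returns
# (predicted boxes, region embeddings) for the queried regions.
import torch
import torch.nn.functional as F


def multi_view_pretrain_step(model, view1, view2, boxes1, boxes2):
    """One pretraining step on two augmented views of the same image.

    view1, view2  : augmented image tensors, shape [B, 3, H, W]
    boxes1, boxes2: the same sampled regions mapped into each view's
                    coordinate frame, shape [B, N, 4]
    """
    # Multi-view region detection: condition queries on regions from one
    # view and predict where those regions appear in the other view.
    pred_boxes_in_2, feats_2 = model(view2, query_boxes=boxes1)
    pred_boxes_in_1, feats_1 = model(view1, query_boxes=boxes2)
    loc_loss = (F.l1_loss(pred_boxes_in_2, boxes2)
                + F.l1_loss(pred_boxes_in_1, boxes1))

    # Multi-view semantic discrimination: pull the embeddings of the same
    # region across the two views together (negative cosine similarity),
    # in the spirit of Siamese representation learning.
    z1 = F.normalize(feats_1, dim=-1)
    z2 = F.normalize(feats_2, dim=-1)
    disc_loss = -(z1 * z2).sum(dim=-1).mean()

    return loc_loss + disc_loss
```

In the paper's framework the localization term trains the DETR decoder to be detection-oriented, while the discrimination term encourages view-invariant, object-level representations; the equal weighting used above is an assumption of this sketch.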

Authors (8)
  1. Zeren Chen (8 papers)
  2. Gengshi Huang (3 papers)
  3. Wei Li (1122 papers)
  4. Jianing Teng (4 papers)
  5. Kun Wang (355 papers)
  6. Jing Shao (109 papers)
  7. Chen Change Loy (288 papers)
  8. Lu Sheng (63 papers)
Citations (7)
