
Detecting the open-world objects with the help of the Brain (2303.11623v1)

Published 21 Mar 2023 in cs.CV

Abstract: Open World Object Detection (OWOD) is a novel computer vision task with a considerable challenge, bridging the gap between classic object detection (OD) benchmarks and real-world object detection. In addition to detecting and classifying seen/known objects, OWOD algorithms are expected to detect unseen/unknown objects and incrementally learn them. The natural instinct of humans to identify unknown objects in their environments mainly depends on their brains' knowledge base. It is difficult for a model to do this only by learning from the annotation of several tiny datasets. Large pre-trained grounded language-image models (VL, i.e., GLIP) have rich knowledge about the open world but are limited to the text prompt. We propose leveraging the VL as the "Brain" of the open-world detector by simply generating unknown labels. Leveraging it is non-trivial because the unknown labels impair the model's learning of known objects. In this paper, we alleviate these problems by proposing the down-weight loss function and decoupled detection structure. Moreover, our detector leverages the "Brain" to learn novel objects beyond VL through our pseudo-labeling scheme.

Authors (8)
  1. Shuailei Ma (13 papers)
  2. Yuefeng Wang (6 papers)
  3. Ying Wei (80 papers)
  4. Peihao Chen (28 papers)
  5. Zhixiang Ye (2 papers)
  6. Jiaqi Fan (12 papers)
  7. Enming Zhang (14 papers)
  8. Thomas H. Li (32 papers)
Citations (1)
