
Multiple Instance Detection Network with Online Instance Classifier Refinement (1704.00138v1)

Published 1 Apr 2017 in cs.CV

Abstract: Of late, weakly supervised object detection is with great importance in object recognition. Based on deep learning, weakly supervised detectors have achieved many promising results. However, compared with fully supervised detection, it is more challenging to train deep network based detectors in a weakly supervised manner. Here we formulate weakly supervised detection as a Multiple Instance Learning (MIL) problem, where instance classifiers (object detectors) are put into the network as hidden nodes. We propose a novel online instance classifier refinement algorithm to integrate MIL and the instance classifier refinement procedure into a single deep network, and train the network end-to-end with only image-level supervision, i.e., without object location information. More precisely, instance labels inferred from weak supervision are propagated to their spatially overlapped instances to refine instance classifier online. The iterative instance classifier refinement procedure is implemented using multiple streams in deep network, where each stream supervises its latter stream. Weakly supervised object detection experiments are carried out on the challenging PASCAL VOC 2007 and 2012 benchmarks. We obtain 47% mAP on VOC 2007 that significantly outperforms the previous state-of-the-art.

Overview of "Multiple Instance Detection Network with Online Instance Classifier Refinement"

The paper addresses weakly supervised object detection (WSOD) by formulating it as a Multiple Instance Learning (MIL) problem, in which each image is treated as a bag of candidate regions (instances) and the instance classifiers, i.e. the object detectors, are embedded in the network as hidden nodes. WSOD is challenging because training relies only on image-level labels, with no object location annotations, whereas fully supervised detection uses bounding-box annotations.
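
To make the MIL formulation concrete, the sketch below shows one common way to turn per-proposal scores into an image-level prediction that can be trained from image labels alone. It assumes a two-branch scoring scheme (a softmax over classes multiplied by a softmax over proposals, then summed over proposals) and a binary cross-entropy loss; this summary does not state that the paper uses exactly this aggregation, and the function names `aggregate_scores` and `image_level_loss` are illustrative.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_scores(cls_logits, det_logits):
    """Combine per-proposal logits into proposal scores and image scores.

    cls_logits[r][c] and det_logits[r][c] are logits for proposal r, class c.
    Assumed scheme: softmax over classes times softmax over proposals.
    """
    n_props = len(cls_logits)
    n_classes = len(cls_logits[0])
    # Which class does each proposal look like? (softmax across classes)
    cls_prob = [softmax(row) for row in cls_logits]
    # Which proposal matters most for each class? (softmax across proposals)
    det_prob = [
        softmax([det_logits[r][c] for r in range(n_props)])
        for c in range(n_classes)
    ]
    proposal_scores = [
        [cls_prob[r][c] * det_prob[c][r] for c in range(n_classes)]
        for r in range(n_props)
    ]
    # Summing over proposals gives one score per class, always in [0, 1].
    image_scores = [
        sum(proposal_scores[r][c] for r in range(n_props))
        for c in range(n_classes)
    ]
    return proposal_scores, image_scores


def image_level_loss(image_scores, labels, eps=1e-6):
    """Binary cross-entropy between image scores and 0/1 image labels."""
    loss = 0.0
    for s, y in zip(image_scores, labels):
        s = min(max(s, eps), 1.0 - eps)
        loss -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return loss
```

Only the image-level labels enter the loss here, so no box annotation is needed; the per-proposal scores are a by-product that a refinement step can then build on.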

Key Contributions and Methodology

  1. Integration of MIL and Deep Networks: The authors cast WSOD as a MIL problem and integrate MIL with instance classifier refinement in a single deep network that is trained end-to-end. The refinement is performed online, using instance labels inferred from the image-level (weak) supervision.
  2. Novel Online Instance Classifier Refinement (OICR): The core of the approach is the OICR algorithm. Rather than alternating between separate, more time-consuming stages of retraining classifiers and relabeling instances, it propagates the labels inferred from weak supervision to spatially overlapping instances and refines the classifiers during training. The procedure is implemented as multiple streams in the network, with each stream supervising the one after it.
  3. Empirical Validation: The method is evaluated on the PASCAL VOC 2007 and 2012 benchmarks and reaches 47% mean Average Precision (mAP) on VOC 2007, significantly outperforming the previous state of the art. These results support the claim that refining classifiers online, using spatial overlap between instances, yields more discriminative instance classifiers.
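
The label propagation step described in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the IoU threshold of 0.5, the `BACKGROUND` marker, the tie-breaking rule (keep the class with the largest overlap), and the function names are all assumptions made for the example.

```python
BACKGROUND = -1  # hypothetical marker for "no object class assigned"


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def propagate_labels(boxes, scores, image_classes, iou_threshold=0.5):
    """Assign a pseudo-label to every proposal from the previous stream's scores.

    boxes[r] is proposal r; scores[r][c] is its score for class c.
    image_classes lists the classes known to be present in the image.
    """
    labels = [BACKGROUND] * len(boxes)
    best_overlap = [0.0] * len(boxes)
    for c in image_classes:
        # The top-scoring proposal is taken as the seed instance for class c.
        seed = max(range(len(boxes)), key=lambda r: scores[r][c])
        for r, box in enumerate(boxes):
            overlap = iou(box, boxes[seed])
            # Proposals that overlap the seed enough inherit its label.
            if overlap >= iou_threshold and overlap > best_overlap[r]:
                labels[r] = c
                best_overlap[r] = overlap
    return labels


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [[0.9], [0.2], [0.1]]
labels = propagate_labels(boxes, scores, image_classes=[0])
```

In a multi-stream setup, the labels produced from one stream's scores would serve as training targets for the next stream, which is how each stream can supervise the one after it.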

Strong Numerical Results and Claims

The algorithm improves on prior methods. Specifically, the paper reports an increase from 29.5% mAP for the base network to 37.9% mAP when the iterative refinement strategy is added, together with higher Correct Localization (CorLoc) values, which indicate better localization. Incremental refinement thus narrows the gap to fully supervised detectors while using no location annotations.
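
For readers unfamiliar with CorLoc, the sketch below shows how the metric is commonly computed: for each image that contains a class, the top-scoring predicted box counts as correct if it overlaps some ground-truth box of that class with IoU of at least 0.5. The data layout and the name `corloc` are assumptions for illustration, and the boxes in the usage lines are made-up toy values rather than results from the paper.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(samples, iou_threshold=0.5):
    """Fraction of images whose top predicted box hits a ground-truth box.

    samples is a list of (top_box, gt_boxes) pairs for one class, one pair
    per image that contains that class.
    """
    if not samples:
        return 0.0
    hits = sum(
        1
        for top_box, gt_boxes in samples
        if any(box_iou(top_box, gt) >= iou_threshold for gt in gt_boxes)
    )
    return hits / len(samples)


# Toy data: the first prediction overlaps its ground truth, the second misses.
toy_samples = [
    ((0, 0, 10, 10), [(0, 0, 10, 12)]),
    ((0, 0, 5, 5), [(20, 20, 30, 30)]),
]
toy_corloc = corloc(toy_samples)
```

Unlike mAP, this metric looks only at the single best box per image, so it measures localization rather than ranking quality across all detections.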

Implications and Future Directions

The implications of the proposed method are twofold. Practically, it offers a route to detection systems that need less manual annotation, reducing the cost of dataset preparation. Theoretically, building online instance classifier refinement into weakly supervised training could encourage further research into end-to-end systems that learn from limited supervision.

The method invites further work on stronger refinement strategies, for example by exploiting contextual information when assigning the refined instance labels, a direction suggested by the comparisons with other approaches. Beyond still images, similar ideas may apply in other domains that need object detection with minimal supervision, such as video analysis and medical imaging.

In conclusion, the paper introduces a WSOD framework and an online refinement algorithm that outperform previous weakly supervised methods on PASCAL VOC 2007, and it opens further avenues for learning detectors from limited supervision.

Authors (4)
  1. Peng Tang (47 papers)
  2. Xinggang Wang (163 papers)
  3. Xiang Bai (221 papers)
  4. Wenyu Liu (146 papers)
Citations (422)