
Revisiting Multiple Instance Neural Networks (1610.02501v1)

Published 8 Oct 2016 in stat.ML and cs.LG

Abstract: Recently, neural networks and multiple instance learning have both become attractive topics in Artificial Intelligence research. Deep neural networks have achieved great success in supervised learning problems, and multiple instance learning, as a typical weakly-supervised learning method, is effective for many applications in computer vision, biometrics, natural language processing, etc. In this paper, we revisit the problem of solving multiple instance learning problems using neural networks. Neural networks are appealing for this task: a multiple instance neural network performs multiple instance learning in an end-to-end way, taking a bag with a varying number of instances as input and directly outputting the bag label, and all of its parameters can be optimized via back-propagation. We propose a new multiple instance neural network that learns bag representations, in contrast to existing multiple instance neural networks that focus on estimating instance labels. In addition, we study recent techniques developed in deep learning within multiple instance networks, and we find deep supervision is effective for boosting bag classification accuracy. In the experiments, the proposed multiple instance networks achieve state-of-the-art or competitive performance on several MIL benchmarks. Moreover, they are extremely fast for both training and testing, e.g., taking only 0.0003 seconds to predict a bag and a few seconds to train on a MIL dataset on a moderate CPU.

Authors (5)
  1. Xinggang Wang (163 papers)
  2. Yongluan Yan (2 papers)
  3. Peng Tang (47 papers)
  4. Xiang Bai (222 papers)
  5. Wenyu Liu (146 papers)
Citations (413)

Summary

An Analysis of "Revisiting Multiple Instance Neural Networks"

The paper "Revisiting Multiple Instance Neural Networks" by Xinggang Wang et al. addresses the challenge of Multiple Instance Learning (MIL) with neural networks, a topic that has garnered substantial interest owing to its relevance in weakly-supervised learning scenarios across applications ranging from computer vision to biometrics.

Overview

The authors investigate the synergy between deep neural networks and MIL, proposing a Multiple Instance Neural Network (MINN) framework that performs end-to-end learning. This approach diverges from traditional MIL techniques that either focus on instance classification within a bag or rely on bag embeddings derived from instance features. The central thrust of this research is the development of a network architecture capable of effectively learning bag representations that optimize classification performance, distinguishing it from methods that attempt to infer instance labels.

Proposed Approach

The paper introduces two primary architectures: mi-Net and MI-Net. The former aligns with the instance-space paradigm, as it seeks to infer instance probabilities before aggregating them to predict the bag label. The latter is innovative in its focus on directly learning a fixed-length bag representation using a series of fully connected layers and MIL Pooling Layer (MPL) operations, which eschew the need for explicit instance probability estimates.
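The contrast between the two paradigms can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's exact configuration: the layer sizes, random weights, single hidden layer, and the choice of max pooling are all assumptions made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions and randomly initialized weights (illustrative only).
D, H = 16, 8                              # instance feature dim, hidden dim
W1 = rng.normal(size=(D, H)) * 0.1        # shared instance transform
w_inst = rng.normal(size=(H, 1)) * 0.1    # instance scorer (mi-Net)
w_bag = rng.normal(size=(H,)) * 0.1       # bag scorer (MI-Net)

def mi_net(bag):
    """Instance-space mi-Net: score every instance, then pool the probabilities."""
    h = relu(bag @ W1)                    # (n_instances, H)
    p_inst = sigmoid(h @ w_inst)          # per-instance probabilities
    return p_inst.max()                   # bag probability via max pooling

def MI_net(bag):
    """Embedded-space MI-Net: pool instance features into one fixed-length
    bag representation, then classify that representation directly."""
    h = relu(bag @ W1)                    # (n_instances, H)
    z = h.max(axis=0)                     # bag representation (MPL over features)
    return sigmoid(z @ w_bag)             # bag probability

bag = rng.normal(size=(5, D))             # a bag with 5 instances
print(mi_net(bag), MI_net(bag))           # both are bag-level probabilities
```

Note how the pooling operation moves: mi-Net pools instance *probabilities*, while MI-Net pools instance *features* before any classification happens, which is what lets it sidestep instance-label estimation entirely.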

Methodological Innovations

The paper's notable contributions include the:

  • Introduction of deep supervision: This involves injecting additional loss functions at intermediate layers, enhancing the learning of hierarchical representations within the network.
  • Employment of residual connections: Inspired by the success of residual networks, these connections are intended to ease the learning of bag-level representations, although the results with them are mixed, suggesting room for further tuning or exploration of alternative network configurations.
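The deep-supervision idea above can be sketched as a loss that combines the final bag prediction with auxiliary bag predictions made from intermediate layers. The sketch below is a hand-rolled numpy forward pass under assumed layer sizes and an assumed auxiliary-loss weight of 0.5; the paper's actual architecture and weighting may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(p, y):
    """Binary cross-entropy for a single bag label y in {0, 1}."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical dimensions and random weights (illustrative only).
D, H1, H2 = 16, 8, 4
W1 = rng.normal(size=(D, H1)) * 0.1
W2 = rng.normal(size=(H1, H2)) * 0.1
v1 = rng.normal(size=(H1,)) * 0.1   # auxiliary bag classifier on layer 1
v2 = rng.normal(size=(H2,)) * 0.1   # final bag classifier on layer 2

def deeply_supervised_loss(bag, y):
    h1 = relu(bag @ W1)                  # intermediate instance features
    h2 = relu(h1 @ W2)                   # deeper instance features
    # Pool each layer's features into a bag vector and score it separately.
    p1 = sigmoid(h1.max(axis=0) @ v1)    # auxiliary bag prediction
    p2 = sigmoid(h2.max(axis=0) @ v2)    # final bag prediction
    # Total loss = final loss + weighted auxiliary loss (weight is an assumption).
    return bce(p2, y) + 0.5 * bce(p1, y)

bag, y = rng.normal(size=(6, D)), 1.0
print(deeply_supervised_loss(bag, y))
```

Because every auxiliary branch ends in its own bag-level loss, gradients reach the early layers directly instead of only through the deepest path, which is the mechanism the deep-supervision trick relies on.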

The networks make use of differentiable pooling strategies—max pooling, mean pooling, and log-sum-exp pooling—to produce bag representations, each contributing differently to performance outcomes across various datasets.
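The three pooling strategies are easy to state concretely. In the log-sum-exp form below, the sharpness parameter `r` is a common formulation (it interpolates between mean pooling as `r → 0` and max pooling as `r → ∞`); treat the specific value of `r` as an assumption rather than the paper's setting.

```python
import numpy as np

def max_pool(s):
    return s.max(axis=0)

def mean_pool(s):
    return s.mean(axis=0)

def lse_pool(s, r=3.0):
    """Log-sum-exp pooling: (1/r) * log(mean(exp(r * s))).
    A smooth, differentiable surrogate that approaches mean pooling for
    small r and max pooling for large r."""
    return np.log(np.mean(np.exp(r * s), axis=0)) / r

# Instance scores (or a 1-D feature) for a bag of 3 instances.
s = np.array([[0.1], [0.9], [0.4]])
print(mean_pool(s), lse_pool(s, 3.0), max_pool(s))
```

Max pooling keys the bag entirely on its strongest instance (matching the classic MIL assumption), mean pooling lets every instance vote, and log-sum-exp sits in between, which is one reason the best choice varies across datasets.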

Experimental Findings

The empirical evaluation spans five MIL benchmarks, including MUSK and several vision datasets, where the proposed networks achieve either state-of-the-art or competitive results. In particular, MI-Net with deep supervision outperformed traditional approaches on most datasets, showcasing the potency of the new method.

The networks demonstrated not only superior predictive capability but also remarkable efficiency, with computational demands significantly reduced in both training and inference compared to some traditional MIL techniques.

Implications and Future Directions

This paper's contributions hold substantial implications for MIL research, reinforcing the applicability of neural networks in settings where incomplete instance-level information is the norm. The proposed architectures simplify the learning process by treating bags as holistic entities, thereby bypassing the need for detailed instance annotation.

Future work might explore further enhancements of the MIL pooling layer, or experiment with deeper or wider network configurations to see whether the benefits observed in other deep learning tasks translate to the MIL setting. Additionally, the findings could spur advances in weakly-supervised learning paradigms where instance ambiguity presents significant challenges.

In conclusion, this research marks an important step forward in the neural network-assisted exploration of MIL, laying groundwork for subsequent advances in the field and potentially influencing a broad spectrum of AI applications where weak supervision is prevalent.