Transferable Interactiveness Knowledge for Human-Object Interaction Detection (2101.10292v3)

Published 25 Jan 2021 in cs.CV, cs.AI, and cs.LG

Abstract: Human-Object Interaction (HOI) detection is an important problem to understand how humans interact with objects. In this paper, we explore interactiveness knowledge which indicates whether a human and an object interact with each other or not. We found that interactiveness knowledge can be learned across HOI datasets and bridge the gap between diverse HOI category settings. Our core idea is to exploit an interactiveness network to learn the general interactiveness knowledge from multiple HOI datasets and perform Non-Interaction Suppression (NIS) before HOI classification in inference. On account of the generalization ability of interactiveness, interactiveness network is a transferable knowledge learner and can be cooperated with any HOI detection models to achieve desirable results. We utilize the human instance and body part features together to learn the interactiveness in hierarchical paradigm, i.e., instance-level and body part-level interactivenesses. Thereafter, a consistency task is proposed to guide the learning and extract deeper interactive visual clues. We extensively evaluate the proposed method on HICO-DET, V-COCO, and a newly constructed PaStaNet-HOI dataset. With the learned interactiveness, our method outperforms state-of-the-art HOI detection methods, verifying its efficacy and flexibility. Code is available at https://github.com/DirtyHarryLYL/Transferable-Interactiveness-Network.

Summary

  • The paper presents a novel transfer learning method that uses interactiveness knowledge to filter non-interactive human-object pairs and boost HOI detection accuracy.
  • It employs a hierarchical approach by combining instance-level and part-level features with a consistency learning task for refined interactiveness predictions.
  • Experiments on HICO-DET, V-COCO, and PaStaNet-HOI, including 20.93 mAP on HICO-DET, support the method's effectiveness and its adaptability across different HOI detection models.

Overview of Transferable Interactiveness Knowledge for Human-Object Interaction Detection

The paper "Transferable Interactiveness Knowledge for Human-Object Interaction Detection" addresses Human-Object Interaction (HOI) detection through interactiveness knowledge: a binary judgment of whether a human and an object interact at all, independent of the specific interaction category. Because this judgment does not depend on any one dataset's category list, it can be learned jointly from several HOI datasets and reused by different HOI detection models.

Core Contributions

The authors propose a novel transfer-learning method that leverages interactiveness knowledge learned across multiple HOI datasets, transcending individual dataset constraints. The approach involves an interactiveness network that performs Non-Interaction Suppression (NIS) before HOI classification, thus filtering out non-interactive human-object pairs and reducing false positives in the inference stage. This method is shown to be applicable across various HOI detection models due to its generalization ability.
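As a rough illustration of the filtering step described above, the sketch below applies a threshold to per-pair interactiveness scores before any HOI classification would run. The `Pair` class, the function name, the scores, and the threshold of 0.1 are all hypothetical choices made for this example; the paper's actual scoring and thresholds may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pair:
    """A candidate human-object pair (hypothetical structure for illustration)."""
    human_id: int
    object_id: int
    interactiveness: float  # assumed score in [0, 1] from an interactiveness network


def non_interaction_suppression(pairs: List[Pair], threshold: float = 0.1) -> List[Pair]:
    """Keep only pairs whose interactiveness score reaches the threshold."""
    return [p for p in pairs if p.interactiveness >= threshold]


# Dense graph: every detected human is paired with every detected object.
dense = [
    Pair(0, 0, 0.92),
    Pair(0, 1, 0.03),
    Pair(1, 0, 0.07),
    Pair(1, 1, 0.64),
]

# Sparse graph: only the pairs judged interactive go on to HOI classification.
sparse = non_interaction_suppression(dense, threshold=0.1)
kept = [(p.human_id, p.object_id) for p in sparse]
print(kept)  # [(0, 0), (1, 1)]
```

Only the surviving pairs would be passed to the HOI classifier, which is where the reduction in false positives comes from.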

Technical Highlights

  1. Hierarchical Interactiveness Paradigm: The framework learns interactiveness at two levels, instance level and body-part level, using both whole-human instance features and body part features. The part-level branch gives a finer-grained view of which parts of the body are involved in an interaction.
  2. Consistency Learning Task: The paper introduces a consistency task between instance-level and part-level interactiveness predictions. It guides the learning process and helps the network extract deeper interactive visual cues.
  3. Non-Interaction Suppression (NIS): The interactiveness network applies NIS during inference to suppress non-interactive pairs, converting a dense HOI graph into a sparse one. Fewer candidate pairs then reach the HOI classifier, which reduces false positives.
  4. Low-Grade Instance Suppressive Function (LIS): This function adjusts the confidence of interactiveness predictions based on the detection scores of humans and objects, enhancing the model's robustness against low-quality detections.
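
The reweighting and consistency ideas in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic form of `lis`, the constants `T`, `k`, and `w`, and the use of the maximum part-level score in `consistency_penalty` are all choices made here for the example.

```python
import math
from typing import Sequence


def lis(score: float, T: float = 8.0, k: float = 12.0, w: float = 10.0) -> float:
    """Logistic-shaped weight (assumed form): low detection scores get a much smaller weight.

    T, k, and w are illustrative constants, not values taken from the paper.
    """
    return T / (1.0 + math.exp(k - w * score))


def weighted_interactiveness(raw: float, human_det: float, object_det: float) -> float:
    """Scale a raw interactiveness score by the weights of both detections."""
    return raw * lis(human_det) * lis(object_det)


def consistency_penalty(instance_score: float, part_scores: Sequence[float]) -> float:
    """Squared gap between the instance-level score and the strongest part-level score.

    Assumption: a person who interacts should have at least one interacting body part,
    so the maximum part score is compared with the instance score.
    """
    return (instance_score - max(part_scores)) ** 2


# A confidently detected human keeps far more weight than a weakly detected one.
strong = weighted_interactiveness(0.8, human_det=0.9, object_det=0.9)
weak = weighted_interactiveness(0.8, human_det=0.3, object_det=0.9)

# Agreement between the two levels gives zero penalty; disagreement is penalized.
agree = consistency_penalty(0.9, [0.1, 0.9, 0.2])
disagree = consistency_penalty(0.9, [0.1, 0.2])
```

In a training setup, a term like `consistency_penalty` would be added to the loss, while a weight like `lis` would be applied to scores at inference time.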

Experimental Results and Claims

The method is evaluated on the HICO-DET, V-COCO, and PaStaNet-HOI datasets. It outperforms the state-of-the-art methods it is compared against, reaching 20.93 mAP on HICO-DET's Default Full set. The authors attribute these results to the transfer of interactiveness knowledge across datasets, which particularly benefits the detection of rare HOI categories.
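
The summary does not spell out the evaluation protocol behind the mAP figure. HICO-DET evaluation is commonly described as counting a detection as a true positive when the HOI class matches and both the human box and the object box overlap their ground-truth boxes with IoU of at least 0.5. The sketch below illustrates that matching rule only; the function names and box format are assumptions for the example, and it does not compute full average precision.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_h: Box, pred_o: Box, pred_label: str,
                     gt_h: Box, gt_o: Box, gt_label: str,
                     thr: float = 0.5) -> bool:
    """Both boxes must overlap their ground truth and the HOI label must match."""
    if pred_label != gt_label:
        return False
    return min(iou(pred_h, gt_h), iou(pred_o, gt_o)) >= thr


human = (0.0, 0.0, 10.0, 10.0)
bike = (0.0, 0.0, 10.0, 10.0)
shifted_bike = (5.0, 0.0, 15.0, 10.0)  # overlaps the ground-truth bike with IoU 1/3

hit = is_true_positive(human, bike, "ride bicycle", human, bike, "ride bicycle")
miss = is_true_positive(human, shifted_bike, "ride bicycle", human, bike, "ride bicycle")
```

Under this rule a well-localized human is not enough on its own: the poorly localized object box in the second call makes the whole detection a miss.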

Implications and Future Directions

The introduction of transferable interactiveness knowledge opens new avenues for improving HOI detection systems' adaptability and scalability. By decoupling interactiveness from specific interaction categories, the methodology simplifies knowledge transfer and reduces annotation demands. Future developments could focus on expanding the scope of interactiveness knowledge to more diverse datasets and refining part-level predictions to handle occlusions and noise better.

Moreover, integrating interactiveness insights with other forms of contextual and semantic information could further enhance HOI detection in complex scenes. The approach may also inspire advancements in related fields, such as action recognition and behavior analysis, where understanding human-object interactions is pivotal.

In conclusion, by framing interactiveness as a transferable attribute, this work offers a mechanism for bridging dataset-specific category settings and for improving the performance of existing HOI detection models.