- The paper introduces NICE, a model-agnostic approach that refines noisy dataset annotations to boost scene graph generation accuracy.
- It utilizes negative and positive sample detection modules to identify and correct missing or inconsistent labels via confidence scoring and clustering.
- Experimental results on the Visual Genome dataset demonstrate substantial improvements across SGG tasks, mitigating long-tailed biases in models like Motifs and VCTree.
Overview of "The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation"
The paper "The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation" presents an approach to improving Scene Graph Generation (SGG) by tackling noisy annotations in existing datasets. The authors challenge two prevailing assumptions in SGG training: that all annotated (positive) labels are correct, and that all un-annotated samples are true background. Both assumptions fail on real datasets, introducing biases that hurt the robustness and accuracy of SGG models.
Scene Graph Generation, a pivotal part of understanding visual scenes, involves identifying object instances and their pairwise visual relationships. However, it faces challenges due to the imbalance in dataset annotations, termed the "long-tailed" problem, where certain predicate categories are underrepresented. Traditional techniques to mitigate these biases rely on re-balancing strategies or manipulating pre-trained models, but they often overlook the noise inherent in the dataset labels themselves.
To rectify these issues, the paper proposes NICE (NoIsy label CorrEction strategy), a model-agnostic technique that focuses on refining dataset annotations to enhance SGG. NICE consists of the following components:
- Negative Noisy Sample Detection (Neg-NSD): This module finds missing annotations by framing the problem as out-of-distribution (OOD) detection. Using a confidence-based model, it assigns pseudo labels to un-annotated samples that are likely missed foreground relationships, effectively enlarging the training set with valid but previously unlabeled (often tail-category) samples.
- Positive Noisy Sample Detection (Pos-NSD): Using a clustering method based on visual similarity, Pos-NSD identifies inconsistencies among positive samples. It segregates samples into subsets based on their local densities and identifies noisy samples that do not align with the general feature distribution.
- Noisy Sample Correction (NSC): Implementing a weighted K-nearest neighbor (wKNN) algorithm, NSC reassigns more consistent labels to identified noisy samples, ensuring that visual patterns align better with their semantic labels.
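As a rough, hypothetical sketch of the Neg-NSD idea (not the paper's implementation — the function name and threshold value are illustrative), confidence-based pseudo-labeling of "background" samples might look like:

```python
def pseudo_label_negatives(bg_scores, threshold=0.9):
    """Promote 'background' samples to pseudo-labeled positives when a
    model is highly confident they show a real (missed) predicate.

    bg_scores: list of (sample_id, {predicate: confidence}) pairs for
    samples currently annotated as background.
    Returns: {sample_id: pseudo_predicate} for confident samples only.
    """
    pseudo = {}
    for sample_id, scores in bg_scores:
        # Take the most confident foreground predicate for this sample.
        predicate, confidence = max(scores.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            # Confident enough to treat as a missed annotation.
            pseudo[sample_id] = predicate
    return pseudo
```

A thresholded top-1 confidence is only one possible OOD criterion; the paper's module is a learned detector, but the effect is the same — recovering usable positives from the un-annotated pool.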
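The Pos-NSD step can be illustrated with a toy local-density computation (a simplified stand-in for the paper's density-based clustering; the neighbor count `k` and the "flag the lowest-density samples" rule are assumptions for illustration):

```python
import math

def local_density(feats, k=2):
    """Local density of each feature vector: inverse of the mean
    Euclidean distance to its k nearest neighbors (higher = denser)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    densities = []
    for i, f in enumerate(feats):
        nearest = sorted(dist(f, g) for j, g in enumerate(feats) if j != i)[:k]
        densities.append(1.0 / (sum(nearest) / len(nearest) + 1e-8))
    return densities

def flag_noisy(feats, k=2, n_flag=1):
    """Flag the n_flag lowest-density samples as noisy candidates:
    points far from the bulk of their class's feature distribution."""
    densities = local_density(feats, k)
    order = sorted(range(len(densities)), key=lambda i: densities[i])
    return order[:n_flag]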
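The NSC step's weighted K-nearest-neighbor relabeling could be sketched as follows (illustrative only; the actual method operates on learned visual features of the SGG model):

```python
import math

def wknn_relabel(noisy_feat, clean_feats, clean_labels, k=3):
    """Reassign a label to a noisy sample by voting among its k nearest
    'clean' samples, weighting each vote by inverse distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Find the k clean samples closest to the noisy sample's features.
    nearest = sorted(zip(clean_feats, clean_labels),
                     key=lambda fl: dist(noisy_feat, fl[0]))[:k]
    # Accumulate inverse-distance-weighted votes per label.
    votes = {}
    for feat, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (dist(noisy_feat, feat) + 1e-8)
    return max(votes, key=votes.get)
```

Inverse-distance weighting makes closer neighbors count more, so the reassigned label tracks the dominant label in the sample's immediate visual neighborhood.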
The authors validate NICE with extensive experiments on the Visual Genome (VG) dataset. When integrated into state-of-the-art SGG models such as Motifs and VCTree, NICE yields substantial improvements, particularly on mean Recall@K (mR@K), the standard metric for unbiased scene graph generation, across the Predicate Classification, Scene Graph Classification, and Scene Graph Detection tasks.
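For reference, mR@K averages Recall@K per predicate class so rare predicates weigh as much as frequent ones. A simplified sketch (real SGG evaluation also matches predicted boxes to ground truth by IoU, which is omitted here):

```python
from collections import defaultdict

def mean_recall_at_k(predictions, ground_truth, k=50):
    """mR@K: compute Recall@K separately for each predicate class,
    then average across classes.

    predictions: per-image ranked lists of (subject, predicate, object) triplets
    ground_truth: per-image sets of gold triplets
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for preds, gt in zip(predictions, ground_truth):
        topk = set(preds[:k])  # top-K ranked predictions for this image
        for triplet in gt:
            predicate = triplet[1]
            totals[predicate] += 1
            if triplet in topk:
                hits[predicate] += 1
    # Per-predicate recall, averaged uniformly over predicate classes.
    recalls = [hits[p] / totals[p] for p in totals]
    return sum(recalls) / len(recalls)
```

Because the average is over classes rather than instances, a model that only predicts head predicates (e.g. "on") scores poorly on mR@K even if its plain R@K is high — which is why the metric is used to measure debiasing.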
The implications of this approach are significant. By improving label accuracy, NICE enhances the training dataset, offering more balanced exposure to varied predicate categories. This, in turn, helps models perform better on underrepresented categories, addressing biases inherent in the dataset. Nevertheless, while NICE makes substantial strides in improving dataset quality and model robustness, certain limitations remain, such as the potential inclusion of incorrectly labeled samples and the varying impacts of hyperparameters across different predicate categories.
Looking ahead, the development and refinement of methods like NICE could improve training workflows not only in SGG but in broader AI applications. Strengthening datasets through label correction is likely to become a more prominent part of model training, providing cleaner supervision and enabling more generalized learning across domains.