- The paper integrates a few clean examples with many noisy ones by weighting the noisy labels according to their inferred relevance, as predicted by graph convolutional networks.
- The approach constructs an affinity graph over samples and casts label cleaning as per-class binary classification, propagating label information from clean to noisy samples.
- Empirical results on ImageNet and Places365 demonstrate significant accuracy improvements in few-shot learning scenarios.
Exploring Graph Convolutional Networks for Few-Shot Learning with Noisy Labels
The paper presents an approach that leverages Graph Convolutional Networks (GCNs) to learn from data with a few clean labels and a substantial amount of noisy labels. It addresses the challenge of training robust classifiers in the presence of labeling inaccuracies, which is particularly relevant when gathering large-scale data from sources where manual verification is impractical, such as online image repositories.
Core Contributions
The authors propose a method that uses the inherent connectivity of the data, represented as a graph, to differentiate between clean and noisy labels across multiple classes. Specifically, they use a GCN to predict a "clean" probability for each noisy sample, which serves as a relevance weight for that sample when training the final classifier. The GCN is applied as a binary classifier, with each class treated as its own binary classification problem.
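The per-class binary view can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the two-layer GCN, the weight shapes, and the randomly initialized parameters `W1` and `w2` are assumptions standing in for learned values.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_clean_probability(A, X, W1, w2):
    """Two-layer GCN mapping each node to a 'clean' probability.

    A: (n, n) affinity matrix, X: (n, d) node features,
    W1: (d, h) and w2: (h,) play the role of learned weights
    (randomly initialized below for illustration only).
    """
    S = normalize_adjacency(A)
    H = np.maximum(S @ X @ W1, 0.0)       # hidden layer with ReLU
    logits = S @ H @ w2                   # one logit per node
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> clean probability

rng = np.random.default_rng(0)
n, d, h = 6, 8, 4
X = rng.normal(size=(n, d))
A = (rng.random((n, n)) > 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T            # symmetric graph, no self-loops
p = gcn_clean_probability(A, X, rng.normal(size=(d, h)), rng.normal(size=h))
print(p.shape)  # (6,)
```

In this binary setup, one such probability vector would be produced per class, with that class's clean samples as positives.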
Key contributions are:
- Integration of Few Clean Examples with Noisy Data: The method cleverly integrates a few clean examples with numerous noisy examples by weighting the latter based on their inferred relevance. This allows for the effective use of noisy datasets in building classifiers without compromising the accuracy provided by clean data.
- Graph-Based Label Cleaning via GCNs: This work is pioneering in utilizing GCNs for cleaning noise in labels. The GCNs operate by learning to differentiate between clean and noisy labels and generate a probability of cleanliness that can be interpreted as a relevance score.
- Superior Performance in Low-Shot Learning Scenarios: The method was evaluated in extended few-shot learning scenarios, significantly improving classification accuracy over both few-shot learning from clean data alone and training on the noisy data without filtering.
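The first contribution above amounts to loss reweighting in the final classifier: clean examples keep full weight, while each noisy example's loss is scaled by its inferred relevance. The sketch below illustrates this with plain logistic regression in NumPy; the relevance values are random stand-ins for GCN outputs, not real predictions.

```python
import numpy as np

def weighted_logistic_step(w, X, y, sample_w, lr=0.5):
    """One gradient step of logistic regression in which each example's
    loss is scaled by sample_w (1.0 for clean examples, the inferred
    relevance score for noisy ones)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    grad = X.T @ (sample_w * (p - y)) / len(y)
    return w - lr * grad

rng = np.random.default_rng(1)
X = rng.normal(size=(25, 3))
y = (X @ np.array([1.0, -1.0, 0.5]) > 0).astype(float)
# First 5 examples play the role of clean data (weight 1.0); the rest
# are "noisy", weighted by a stand-in relevance score in [0, 1].
sample_w = np.concatenate([np.ones(5), rng.random(20)])

w = np.zeros(3)
for _ in range(200):
    w = weighted_logistic_step(w, X, y, sample_w)
```

Setting a noisy example's weight to zero removes it entirely, so hard filtering is a special case of this soft weighting.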
Methodological Insights
The researchers constructed an affinity graph to model the relationships between clean and noisy samples. In this graph, nodes represent samples while edges indicate visual similarity connections between these samples. The authors then employed a GCN to propagate label information across this graph, effectively boosting the signal from the few clean samples to correct potential mistakes in the noisy samples. The binary classification view of GCNs was particularly beneficial, as it enabled straightforward learning and inference of the clean probabilities using weighted binary cross-entropy loss.
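As a rough sketch of the two ingredients above, the following NumPy code builds a k-nearest-neighbor affinity graph from cosine similarities and evaluates a weighted binary cross-entropy that treats the few clean nodes as positives and the noisy nodes as tentative negatives. The choice of k, the equal reweighting of the two groups, and the random inputs are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def knn_affinity(features, k=3):
    """Symmetric k-NN affinity graph with cosine-similarity edge weights."""
    F = features / np.linalg.norm(features, axis=1, keepdims=True)
    S = np.maximum(F @ F.T, 0.0)   # clamp negative similarities to zero
    np.fill_diagonal(S, -np.inf)   # exclude self-edges from the top-k
    A = np.zeros_like(S)
    for i in range(len(S)):
        nn = np.argsort(S[i])[-k:]  # indices of the k most similar nodes
        A[i, nn] = S[i, nn]
    return np.maximum(A, A.T)       # keep an edge if either endpoint chose it

def weighted_bce(p, is_clean):
    """Weighted binary cross-entropy: clean nodes are positives, noisy
    nodes tentative negatives; each group is averaged separately so the
    many noisy nodes do not drown out the few clean ones."""
    eps = 1e-9
    pos = np.asarray(is_clean, dtype=bool)
    loss_pos = -np.log(p[pos] + eps).mean()
    loss_neg = -np.log(1.0 - p[~pos] + eps).mean()
    return 0.5 * (loss_pos + loss_neg)

rng = np.random.default_rng(2)
feats = rng.normal(size=(10, 5))
A = knn_affinity(feats, k=3)
p = rng.random(10)            # stand-in for GCN clean probabilities
is_clean = np.arange(10) < 2  # pretend the first two nodes are clean
loss = weighted_bce(p, is_clean)
```

The per-group averaging is one way to realize a weighted loss when clean examples are heavily outnumbered; the graph would normally be built from features of a pretrained network rather than random vectors.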
Experimental Evaluation
The proposed approach was evaluated on variants of few-shot learning problems using the ImageNet and Places365 benchmarks. The empirical results demonstrate that the GCN-based label cleaning substantially improves classification accuracy. Notably, on ImageNet's novel classes with only one clean label per class, the approach considerably outperformed previous methods, demonstrating the potential of GCNs for handling the combination of few shots and noisy labels.
Furthermore, the method shows general applicability in settings with larger numbers of classes and varying amounts of auxiliary noisy data. Comparisons with traditional methods and recent advances in label-noise handling attest to the robustness and versatility of the proposed solution.
Implications and Future Directions
This research significantly contributes to the field of machine learning, especially in applications where collecting large amounts of clean labeled data is infeasible. The reliance on noisy yet extensive datasets creates opportunities for employing web-crawled data efficiently.
The authors suggest that further research could focus on integrating more sophisticated graph constructions or exploring additional GCN architectures to better handle diverse datasets. Moreover, the principles outlined could be adapted to other domains where label noise is prevalent, enhancing the broader applicability of graph-based learning models in noisy environments.
In conclusion, this work reinforces the valuable role of GCNs in enhancing robustness and accuracy in classification tasks, paving the way for future innovations in AI where data veracity is often compromised.