Deep learning from crowds (1709.01779v2)

Published 6 Sep 2017 in stat.ML, cs.CV, cs.HC, and cs.LG

Abstract: Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so does the need for larger labeled datasets. Recently, crowdsourcing has established itself as an efficient and cost-effective solution for labeling large sets of data in a scalable manner, but it often requires aggregating labels from multiple noisy contributors with different levels of expertise. In this paper, we address the problem of learning deep neural networks from crowds. We begin by describing an EM algorithm for jointly learning the parameters of the network and the reliabilities of the annotators. Then, a novel general-purpose crowd layer is proposed, which allows us to train deep neural networks end-to-end, directly from the noisy labels of multiple annotators, using only backpropagation. We empirically show that the proposed approach is able to internally capture the reliability and biases of different annotators and achieve new state-of-the-art results for various crowdsourced datasets across different settings, namely classification, regression and sequence labeling.

Citations (246)

Summary

  • The paper introduces a novel crowd layer for deep neural networks that learns directly from noisy crowdsourced labels, effectively mitigating annotator biases.
  • It leverages a modified network architecture to bypass computationally intensive EM iterations while supporting tasks from classification to regression.
  • Empirical results on diverse datasets demonstrate state-of-the-art accuracy and robustness, highlighting the practical impact of the proposed method.

Deep Learning from Crowds: A Comprehensive Overview

The paper "Deep Learning from Crowds" by Filipe Rodrigues and Francisco C. Pereira addresses the challenge of training deep neural networks (DNNs) on crowdsourced data. As deep learning continues to advance, the demand for large labeled datasets has grown accordingly. Crowdsourcing has emerged as a cost-effective way to meet this demand, but at the cost of noisy, inconsistent annotations produced by contributors with differing levels of expertise.

Key Contributions

The authors present a novel approach to learning from crowds that combines deep learning with crowdsourcing in a unified framework. They introduce a crowd layer that allows DNNs to be trained end-to-end using noisy labels directly from multiple annotators. This is accomplished through a straightforward modification to the network architecture, enabling backpropagation to account for annotator reliability and biases without the need for computationally intensive iterative methods like Expectation-Maximization (EM).

EM Algorithm

The paper first explores a conventional strategy: an EM algorithm that jointly learns the parameters of the DNN and the reliabilities of the annotators in a multi-class classification setting. The unobserved ground-truth labels are treated as latent variables, which improves over simple aggregation schemes such as majority voting. However, the paper notes practical limitations, including the computational overhead of the iterative procedure and the difficulty of extending it beyond classification to regression or structured-output tasks.
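The latent-variable idea can be sketched with a minimal Dawid-Skene-style EM loop in plain numpy. This is an illustrative simplification, not the paper's exact algorithm: the class prior below stands in for the DNN's predicted distribution, and the M-step here does not retrain a network. All variable names and the toy answer matrix are assumptions for the example.

```python
import numpy as np

N, R, C = 6, 3, 2          # items, annotators, classes
# answers[i, r] = label that annotator r assigned to item i
answers = np.array([[0, 0, 1],
                    [0, 0, 0],
                    [1, 1, 1],
                    [1, 0, 1],
                    [0, 1, 0],
                    [1, 1, 1]])

# Initialize the posterior over true labels with majority voting
q = np.zeros((N, C))
for i in range(N):
    counts = np.bincount(answers[i], minlength=C)
    q[i] = counts / counts.sum()

for _ in range(20):
    # M-step: class prior and per-annotator confusion matrices
    # pi[r, c, k] = P(annotator r answers k | true class is c)
    prior = q.mean(axis=0)
    pi = np.zeros((R, C, C))
    for r in range(R):
        for k in range(C):
            pi[r, :, k] = q[answers[:, r] == k].sum(axis=0)
        pi[r] /= pi[r].sum(axis=1, keepdims=True)
    # E-step: posterior over the latent true label of each item
    for i in range(N):
        logp = np.log(prior)
        for r in range(R):
            logp += np.log(pi[r, :, answers[i, r]] + 1e-12)
        q[i] = np.exp(logp - logp.max())
        q[i] /= q[i].sum()

estimated = q.argmax(axis=1)
print(estimated)
```

The cost the paper highlights is visible even in this toy: every EM iteration re-estimates all confusion matrices and posteriors, and in the full method each M-step would additionally retrain the DNN.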

Crowd Layer

The crowd layer is proposed as an elegant answer to these constraints. This network layer models annotator-specific transformations, such as biases and weights, directly in the network structure, so the network learns to compensate for annotator errors during training itself. The result is a more flexible and efficient approach than previous methods, which often require complicated preprocessing or postprocessing steps.
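One common instantiation for classification can be sketched in plain numpy: the shared network's softmax output is passed through one trainable matrix per annotator (initialized to the identity, as the paper suggests), and the loss sums cross-entropy only over the annotators who actually labelled each item. This is a hedged illustration of the idea, not the paper's exact implementation; the function and variable names are assumptions, and in the real method everything below sits on top of a task-specific DNN and is trained with backpropagation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def crowd_layer_loss(logits, W, labels):
    """logits: (N, C) shared-network outputs; W: (R, C, C) one matrix
    per annotator; labels: (N, R) with -1 marking a missing answer."""
    p = softmax(logits)                       # shared class distribution
    # Per-annotator distributions: transform p by each annotator's matrix
    per_ann = softmax(np.einsum('rck,nk->nrc', W, p))
    N, R = labels.shape
    loss, count = 0.0, 0
    for i in range(N):
        for r in range(R):
            if labels[i, r] >= 0:             # skip missing labels
                loss -= np.log(per_ann[i, r, labels[i, r]] + 1e-12)
                count += 1
    return loss / count

rng = np.random.default_rng(0)
N, R, C = 4, 3, 2
logits = rng.normal(size=(N, C))
W = np.stack([np.eye(C) for _ in range(R)])   # identity initialization
labels = np.array([[0, 0, -1],
                   [1, -1, 1],
                   [0, 1, 0],
                   [-1, 1, 1]])
print(round(crowd_layer_loss(logits, W, labels), 3))
```

Because the annotator-specific matrices are ordinary trainable parameters, a single backpropagation pass updates them together with the shared network, which is precisely what lets the method avoid the EM loop described above.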

Empirical Evaluation

Significant empirical validation is presented across diverse datasets and tasks. The authors demonstrate that their crowd layer approach yields state-of-the-art results on real-world datasets from Amazon Mechanical Turk across various applications, including image classification, text regression, and named entity recognition. The results underscore the ability of the crowd layer model to autonomously distinguish between reliable and unreliable annotations, adjust for biases, and ultimately improve model accuracy compared to previous methods such as EM-based approaches or aggregation techniques like majority voting.

Implications and Future Directions

The practical and theoretical implications of this work are multifaceted. On the one hand, it underscores the potential for integrating noisy, real-world data into sophisticated DNN architectures without onerous preprocessing, thereby reducing the barrier to deploying deep learning solutions in environments where clean data is scarce or expensive. On the other hand, the model's simplicity belies its potential versatility, with straightforward adaptations possible for other kinds of tasks beyond multi-class classification.

In terms of future developments, this research lays the groundwork for further explorations into robust learning algorithms that can accommodate a wider variety of noisy or biased data sources without sacrificing performance. Additionally, the interplay between crowd layer strategies and other architectures within the ecosystem of deep learning merits ongoing exploration.

Overall, the paper makes a substantive contribution to the arena of leveraging crowdsourced data for deep learning, providing a viable path forward for handling annotation noise with elegance and computational efficiency. The introduced methodologies not only advance current capabilities but also offer a scaffold for future innovation in the flexible training of neural networks on imperfect real-world data.