The Role of Learning Algorithms in Collective Action (2405.06582v3)

Published 10 May 2024 in cs.LG, cs.CY, and stat.ML

Abstract: Collective action in machine learning is the study of the control that a coordinated group can have over machine learning algorithms. While previous research has concentrated on assessing the impact of collectives against Bayes (sub-)optimal classifiers, this perspective is limited in that it does not account for the choice of learning algorithm. Since classifiers seldom behave like Bayes classifiers and are influenced by the choice of learning algorithms along with their inherent biases, in this work we initiate the study of how the choice of the learning algorithm plays a role in the success of a collective in practical settings. Specifically, we focus on distributionally robust optimization (DRO), popular for improving a worst group error, and on the ubiquitous stochastic gradient descent (SGD), due to its inductive bias for "simpler" functions. Our empirical results, supported by a theoretical foundation, show that the effective size and success of the collective are highly dependent on properties of the learning algorithm. This highlights the necessity of taking the learning algorithm into account when studying the impact of collective action in machine learning.


Summary

  • The paper finds that algorithm choice significantly alters collective influence, with DRO sometimes enabling smaller groups to outperform larger ones.
  • It demonstrates that SGD's simplicity bias can be strategically exploited by collectives targeting complex data features to manipulate model outcomes.
  • The study introduces the 'effective collective size' metric, showing how algorithm-induced weight adjustments affect a group's overall success.

Exploring the Impact of Learning Algorithms on Collective Action in Machine Learning

Understanding Algorithmic Impact on Collective Strategy

This paper initiates the study of how the choice of learning algorithm shapes the efficacy of collective action in machine learning. At its core is the question of how different algorithms alter a coordinated group's influence when the group attempts to steer model outputs by modifying the data it contributes.

A Closer Look at Two Algorithms: DRO and SGD

The paper examines two classes of learning algorithms:

  1. Distributionally Robust Optimization (DRO): These algorithms optimize worst-case performance across subgroups of the data, a property often used to improve fairness. The paper reveals a counterintuitive result: smaller collectives can achieve greater success under DRO than larger ones, because the re-weighting mechanism can concentrate weight on high-loss groups (a sketch of such an update follows this list).
  2. Stochastic Gradient Descent (SGD): This ubiquitous training method has an inductive bias toward simpler functions, which the paper shows a collective can exploit by embedding its signal in complex features of the data that the model fits late or not at all.
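
To make the DRO mechanics concrete, here is a minimal sketch of an exponentiated-gradient, group-DRO-style re-weighting step. It is not the paper's implementation; the function names, the toy numbers, and the step size eta are illustrative assumptions. What it demonstrates is the amplification channel: groups with higher loss receive exponentially more weight at the next step, so a small collective that constitutes a high-loss group gains influence beyond its headcount.

```python
import numpy as np

def group_dro_reweight(group_losses, group_weights, eta=0.1):
    """One exponentiated-gradient step of a group-DRO-style update.

    Groups with higher average loss get exponentially larger weight,
    so the training objective focuses on the worst-off group. If a
    small collective constitutes a high-loss group, this update
    amplifies its effective influence on the model.
    """
    w = group_weights * np.exp(eta * group_losses)
    return w / w.sum()  # re-normalize to a distribution over groups

def dro_objective(group_losses, group_weights):
    """Weighted objective the model minimizes at this step."""
    return float(np.dot(group_weights, group_losses))

# Toy numbers (not from the paper): a small collective forms group 2
# and keeps its loss high; the weight mass drifts toward its group.
weights = np.full(3, 1.0 / 3.0)
for _ in range(20):
    losses = np.array([0.3, 0.4, 1.2])   # collective's group lags
    weights = group_dro_reweight(losses, weights)
print(weights.round(3))  # ~[0.121, 0.148, 0.731]
```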

Experimenting with Weights and Validation Control

  • The paper introduces the notion of effective collective size. Rather than simply counting collective members, this metric captures how algorithm-induced re-weighting can amplify or diminish the influence of the collective's data points (a sketch follows this list).
  • Experiments show that re-weighting schemes designed to prioritize weak or underrepresented data signals, such as the two-stage algorithms JTT and LfF, can inadvertently amplify the collective's influence when the collective is relatively small.
  • For iterative re-weighting approaches such as DRO, the point at which training is halted, often dictated by performance on a validation set, can dramatically affect collective success. If the collective has sway over this validation set, its ability to manipulate outcomes increases significantly.
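
As a concrete reading of the effective-size idea, the sketch below computes the collective's share of total per-example training weight. This particular formula is an illustrative assumption, not necessarily the paper's exact definition: under uniform weights it reduces to the plain fraction of points the collective controls, and any re-weighting scheme that upweights the collective's examples raises it without changing headcount.

```python
import numpy as np

def effective_collective_size(weights, is_collective):
    """Weighted share of training influence held by the collective.

    weights: per-example weights assigned by the learning algorithm
    (e.g., by a re-weighting scheme such as JTT, LfF, or group DRO);
    is_collective: boolean mask marking the collective's examples.
    Illustrative formula; the paper's exact definition may differ.
    """
    w = np.asarray(weights, dtype=float)
    mask = np.asarray(is_collective, dtype=bool)
    return w[mask].sum() / w.sum()

# Example: 100 points, 10 controlled by the collective.
w = np.ones(100)
mask = np.zeros(100, dtype=bool)
mask[:10] = True
print(effective_collective_size(w, mask))   # 0.10 under uniform weights
w[mask] *= 2.0                              # a re-weighting step doubles them
print(effective_collective_size(w, mask))   # ~0.18, headcount unchanged
```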

Leveraging Algorithm Bias

The paper then turns to strategic manipulations a collective can employ by exploiting the known inductive biases of learning algorithms, particularly biases toward simplicity:

  • Simplicity bias in SGD: Given SGD's inclination toward simpler functions, a collective can increase its success by embedding its signal in complex aspects of the data that SGD is slow to fit.
  • Complexity leverage: By tuning the complexity of the planted signal, for example encoding it in feature interactions rather than in individual features, a collective can raise its success rate against algorithms that otherwise ignore such complexity during training (a toy construction follows this list).
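
As a toy illustration of this idea (our own construction, not the paper's experimental setup), a collective can couple its target label to a feature interaction rather than to a single coordinate. A learner with a simplicity bias tends to fit single-coordinate rules first, so the interaction-based signal faces less competition from the features the model learns early. All names and trigger constructions below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def plant_signal(X, y, idx, target, trigger="complex"):
    """Collective strategy: embed a trigger in its points and relabel.

    Toy construction (not the paper's setup). A 'simple' trigger
    couples the target to a threshold on one raw coordinate; a
    'complex' trigger couples it to an interaction (the product of
    two coordinates being positive), which a simplicity-biased
    learner such as SGD on a neural net tends to fit later.
    """
    X, y = X.copy(), y.copy()
    if trigger == "simple":
        X[idx, 0] = 3.0                       # single-coordinate trigger
    else:
        s = rng.choice([-1.0, 1.0], size=len(idx))
        X[idx, 0] = 2.0 * s                   # same random sign on both
        X[idx, 1] = 2.0 * s                   # coordinates: x0 * x1 > 0,
                                              # yet neither coordinate's
                                              # sign alone reveals the trigger
    y[idx] = target
    return X, y

# Usage: a 5% collective relabels its points and plants the complex trigger.
X = rng.normal(size=(1000, 10))
y = (X[:, 0] > 0).astype(int)
idx = rng.choice(1000, size=50, replace=False)
X_planted, y_planted = plant_signal(X, y, idx, target=1)
```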

Theoretical and Practical Implications

The paper provides theoretical results showing that the choice of learning algorithm plays a critical role in determining the success of collective action. From a practical standpoint, understanding these dynamics can help collectives design better strategies and help model developers fortify training pipelines against unwanted manipulation.

Future Avenues

The paper sets the stage for several intriguing areas:

  • Further analysis across diverse algorithms: Extending the examination to other popular learning algorithms would show whether similar effects on collective action arise.
  • Role of collective information level: Investigating how the amount of information a collective has about the learning algorithm affects its ability to manipulate outcomes could yield valuable insights.
  • Broader impacts on fairness and bias: There is potential to explore how these findings interact with broader issues of fairness, privacy, and robustness in machine learning models.

In sum, this paper opens an important line of inquiry at the intersection of algorithm choice and the efficacy of collective action in machine learning, offering both practical guidance for leveraging known biases and a theoretical framework for understanding these dynamics. As machine learning continues to evolve, understanding these relationships will be central to harnessing, or defending against, collective action within AI systems.
