Diversified Ensembling: An Experiment in Crowdsourced Machine Learning (2402.10795v1)
Abstract: Crowdsourced machine learning on competition platforms such as Kaggle is a popular and often effective method for generating accurate models. Typically, teams vie for the most accurate model, as measured by overall error on a holdout set, and it is common towards the end of such competitions for teams at the top of the leaderboard to ensemble or average their models outside the platform mechanism to get the final, best global model. In arXiv:2201.10408, the authors developed an alternative crowdsourcing framework in the context of fair machine learning, in order to integrate community feedback into models when subgroup unfairness is present and identifiable. There, unlike in classical crowdsourced ML, participants deliberately specialize their efforts by working on subproblems, such as demographic subgroups, in the service of fairness. Here, we take a broader perspective on this work: we note that within this framework, participants may specialize both in the service of fairness and simply to cater to their particular expertise (e.g., focusing on identifying bird species in an image classification task). Unlike traditional crowdsourcing, this allows for the diversification of participants' efforts and may provide a participation mechanism to a larger range of individuals (e.g., a machine learning novice who has insight into a specific fairness concern). We present the first medium-scale experimental evaluation of this framework, with 46 participating teams attempting to generate models to predict income from American Community Survey data. We provide an empirical analysis of teams' approaches, and discuss the novel system architecture we developed. From here, we give concrete guidance for how best to deploy such a framework.
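To make the mechanism concrete, the sketch below illustrates the kind of group-conditional ensemble update that the bias-bounty framework of arXiv:2201.10408 is built on and that this experiment adopts: a participant proposes a (group, model) pair, the pair is accepted if it improves accuracy on that group's holdout points, and accepted pairs are chained into the global model. This is a minimal sketch, not the authors' implementation: the class names, the acceptance tolerance, and the synthetic data are illustrative assumptions, and the actual experiment used American Community Survey income data with the platform's own validation protocol rather than this simplified rule.

```python
"""Minimal sketch of a group-conditional ("decision list") ensemble update,
in the spirit of the bias-bounty framework of arXiv:2201.10408.
All names and the acceptance rule here are illustrative simplifications."""

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


class DecisionListEnsemble:
    """Global model maintained as a chain of accepted (group, model) updates;
    the most recently accepted update that applies to a point decides it."""

    def __init__(self, base_model):
        self.base_model = base_model
        self.updates = []  # accepted (group_fn, model) pairs, oldest first

    def predict(self, X):
        preds = self.base_model.predict(X)
        # Applying updates oldest-to-newest and overwriting on each group
        # means the most recently accepted applicable update wins.
        for group_fn, model in self.updates:
            mask = group_fn(X)
            if mask.any():
                preds[mask] = model.predict(X[mask])
        return preds

    def propose(self, group_fn, model, X_val, y_val, tol=0.0):
        """Simplified acceptance rule: take the (group, model) pair iff it
        beats the current ensemble on the validation points in the group."""
        mask = group_fn(X_val)
        if not mask.any():
            return False
        current_err = np.mean(self.predict(X_val[mask]) != y_val[mask])
        proposed_err = np.mean(model.predict(X_val[mask]) != y_val[mask])
        if proposed_err + tol < current_err:
            self.updates.append((group_fn, model))
            return True
        return False


if __name__ == "__main__":
    # Synthetic stand-in for the income-prediction task; the experiment
    # itself used American Community Survey data (see the ACS PUMS and
    # "Retiring Adult" references below).
    X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.4, random_state=0
    )

    ensemble = DecisionListEnsemble(
        LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    )

    # A "participant" specializes on one subgroup (here, an arbitrary
    # feature split) and trains a model only on that slice of the data.
    group_fn = lambda X: X[:, 0] > 0
    slice_mask = group_fn(X_tr)
    specialist = DecisionTreeClassifier(max_depth=5, random_state=0).fit(
        X_tr[slice_mask], y_tr[slice_mask]
    )

    print("accepted:", ensemble.propose(group_fn, specialist, X_val, y_val))
    print("overall holdout error:", np.mean(ensemble.predict(X_val) != y_val))
```

In the framework this sketch is based on, an accepted update improves error on its group without degrading the rest of the model, which is what makes it safe to merge many independent, specialized contributions; here that property is only approximated, since the same holdout set is reused for every proposal.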
- Benetech - Making Graphs Accessible. https://kaggle.com/competitions/benetech-making-graphs-accessible
- Avrim Blum and Moritz Hardt. 2015. The Ladder: A Reliable Leaderboard for Machine Learning Competitions. In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 37), Francis Bach and David Blei (Eds.). PMLR, Lille, France, 1006–1014. https://proceedings.mlr.press/v37/blum15.html
- ICR - Identifying Age-Related Conditions. https://kaggle.com/competitions/icr-identify-age-related-conditions
- US Census Bureau. October 20, 2022. 2021 ACS PUMS Data Dictionary. https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2021.pdf
- Democratizing data science. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2014), New York, NY, USA, August 24–27.
- Google - American Sign Language Fingerspelling Recognition. https://kaggle.com/competitions/asl-fingerspelling
- Image Matching Challenge 2023. https://kaggle.com/competitions/image-matching-challenge-2023
- Rumman Chowdhury and Jutta Williams. 2021. Introducing Twitter’s First Algorithmic Bias Bounty Challenge. https://blog.twitter.com/engineering/en_us/topics/insights/2021/algorithmic-bias-bounty-challenge
- Emily Diana, Wesley Gill, Michael Kearns, Krishnaram Kenthapadi, and Aaron Roth. 2021. Minimax group fairness: Algorithms and experiments. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. 66–76.
- Frances Ding, Moritz Hardt, John Miller, and Ludwig Schmidt. 2021. Retiring Adult: New Datasets for Fair Machine Learning. Advances in Neural Information Processing Systems 34 (2021).
- Predict Student Performance from Game Play. https://kaggle.com/competitions/predict-student-performance-from-game-play
- Ira Globus-Harris, Michael Kearns, and Aaron Roth. 2022. An algorithmic framework for bias bounties. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. 1106–1124.
- GoDaddy - Microbusiness Density Forecasting. https://kaggle.com/competitions/godaddy-microbusiness-density-forecasting
- HuBMAP - Hacking the Human Vasculature. https://kaggle.com/competitions/hubmap-hacking-the-human-vasculature
- Shachar Kaufman, Saharon Rosset, Claudia Perlich, and Ori Stitelman. 2012. Leakage in data mining: Formulation, detection, and avoidance. ACM Transactions on Knowledge Discovery from Data (TKDD) 6, 4 (2012), 1–21.
- Using the Open Meta Kaggle Dataset to Evaluate Tripartite Recommendations in Data Markets. arXiv preprint arXiv:1908.04017 (2019).
- Vesuvius Challenge - Ink Detection. https://kaggle.com/competitions/vesuvius-challenge-ink-detection
- Natalia Martinez, Martin Bertran, and Guillermo Sapiro. 2020. Minimax Pareto fairness: A multi objective perspective. In International Conference on Machine Learning. PMLR, 6755–6764.
- Arvind Narayanan, Elaine Shi, and Benjamin I. P. Rubinstein. 2011. Link prediction by de-anonymization: How We Won the Kaggle Social Network Challenge. In The 2011 International Joint Conference on Neural Networks. 1825–1834. https://doi.org/10.1109/IJCNN.2011.6033446
- Google Research - Identify Contrails to Reduce Global Warming. https://kaggle.com/competitions/google-research-identify-contrails-reduce-global-warming
- Amit Elazari Bar On. 2018. We Need Bug Bounties for Bad Algorithms. https://www.vice.com/en/article/8xkyj3/we-need-bug-bounties-for-bad-algorithms
- Rebecca Roelofs, Vaishaal Shankar, Benjamin Recht, Sara Fridovich-Keil, Moritz Hardt, John Miller, and Ludwig Schmidt. 2019. A meta-analysis of overfitting in machine learning. Advances in Neural Information Processing Systems 32 (2019).
- Sven Cattell, Rumman Chowdhury, and Austin Carson. 2023. https://aivillage.org/generative%20red%20team/generative-red-team/
- Christopher J Tosh and Daniel Hsu. 2022. Simple and near-optimal algorithms for hidden stratification and multi-group learning. In International Conference on Machine Learning. PMLR, 21633–21657.
- 2023 Kaggle AI Report. https://kaggle.com/competitions/2023-kaggle-ai-report