
Power-up! What Can Generative Models Do for Human Computation Workflows? (2307.02243v1)

Published 5 Jul 2023 in cs.HC and cs.AI

Abstract: We are amidst an explosion of artificial intelligence research, particularly around LLMs. These models have a range of applications across domains like medicine, finance, commonsense knowledge graphs, and crowdsourcing. Investigation into LLMs as part of crowdsourcing workflows remains an under-explored space. The crowdsourcing research community has produced a body of work investigating workflows and methods for managing complex tasks using hybrid human-AI methods. Within crowdsourcing, the role of LLMs can be envisioned as akin to a cog in a larger wheel of workflows. From an empirical standpoint, little is currently understood about how LLMs can improve the effectiveness of crowdsourcing workflows and how such workflows can be evaluated. In this work, we present a vision for exploring this gap from the perspectives of various stakeholders involved in the crowdsourcing paradigm -- the task requesters, crowd workers, platforms, and end-users. We identify junctures in typical crowdsourcing workflows at which the introduction of LLMs can play a beneficial role and propose means to augment existing design patterns for crowd work.
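The design pattern the abstract envisions — an LLM acting as one "cog" inside a larger crowdsourcing workflow — can be illustrated with a minimal sketch. This is not code from the paper; the `llm_draft` function below is a hypothetical stand-in for a generative-model call (replaced here by a trivial keyword heuristic so the snippet runs offline), and the majority-vote verification step is one common crowd aggregation scheme, not necessarily the authors' proposal.

```python
# Sketch of an LLM-assisted annotation workflow: the model drafts a label,
# crowd workers confirm or override it, and the majority vote wins.

def llm_draft(text: str) -> str:
    """Hypothetical stand-in for a generative model proposing a label."""
    return "positive" if "good" in text.lower() else "negative"

def crowd_verify(draft: str, worker_votes: list[str]) -> str:
    """Resolve the final label by majority vote; fall back to the draft."""
    votes = worker_votes or [draft]
    return max(set(votes), key=votes.count)

items = [
    "The tool worked really well, good UX",
    "Crashed twice, frustrating experience",
]

results = []
for item in items:
    draft = llm_draft(item)
    # On a real platform the draft would be shown to workers for review;
    # here we simulate two workers who both confirm the draft.
    final = crowd_verify(draft, worker_votes=[draft, draft])
    results.append((item, draft, final))

for item, draft, final in results:
    print(f"{final:8s} <- draft={draft:8s} | {item}")
```

The point of the pattern is that the requester pays crowd workers for cheap verification rather than expensive from-scratch annotation, while the human-in-the-loop step guards against model errors and biases.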
