Opportunities and Risks of LLMs for Scalable Deliberation with Polis (2306.11932v1)
Abstract: Polis is a platform that leverages machine intelligence to scale up deliberative processes. In this paper, we explore the opportunities and risks associated with applying LLMs to the challenges of facilitating, moderating, and summarizing the results of Polis engagements. In particular, we demonstrate with pilot experiments using Anthropic's Claude that LLMs can indeed augment human intelligence to help run Polis conversations more efficiently. We find that summarization capabilities enable categorically new methods with immense promise to empower the public in collective meaning-making exercises. Notably, LLM context limitations have a significant impact on the insight and quality of these results. However, these opportunities come with risks. We discuss some of these risks, as well as principles and techniques for characterizing and mitigating them, and the implications for other deliberative or political systems that may employ LLMs. Finally, we conclude with several open research directions for augmenting tools like Polis with LLMs.