An Empirical Analysis of Diversity in Argument Summarization (2402.01535v2)

Published 2 Feb 2024 in cs.CL and cs.AI

Abstract: Presenting high-level arguments is a crucial task for fostering participation in online societal discussions. Current argument summarization approaches miss an important facet of this task -- capturing diversity -- which is important for accommodating multiple perspectives. We introduce three aspects of diversity: those of opinions, annotators, and sources. We evaluate approaches to a popular argument summarization task called Key Point Analysis, which shows how these approaches struggle to (1) represent arguments shared by few people, (2) deal with data from various sources, and (3) align with subjectivity in human-provided annotations. We find that both general-purpose LLMs and dedicated KPA models exhibit this behavior, but have complementary strengths. Further, we observe that diversification of training data may ameliorate generalization. Addressing diversity in argument summarization requires a mix of strategies to deal with subjectivity.
