Actor Identification in Discourse: A Challenge for LLMs? (2402.00620v1)

Published 1 Feb 2024 in cs.CL

Abstract: The identification of political actors who put forward claims in public debate is a crucial step in the construction of discourse networks, which help analyze societal debates. Actor identification is, however, rather challenging: often, the locally mentioned speaker of a claim is only a pronoun ("He proposed that [claim]"), so recovering the canonical actor name requires discourse understanding. We compare a traditional pipeline of dedicated NLP components (similar to those applied to the related task of coreference resolution) with an LLM, which appears to be a good match for this generation task. Evaluating on a corpus of German actors in newspaper reports, we surprisingly find that the LLM performs worse. Further analysis reveals that the LLM is very good at identifying the right referent but struggles to generate the correct canonical form. This points to an underlying issue in LLMs with controlling generated output. Indeed, a hybrid model that combines the LLM with a classifier to normalize its output substantially outperforms both initial models.
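The hybrid setup described in the abstract, where a free-form LLM generation is mapped onto a fixed inventory of canonical actor names, can be illustrated with a minimal sketch. This is not the paper's actual implementation: the character-bigram overlap score below stands in for the paper's trained normalization classifier, and the actor names are purely illustrative examples.

```python
# Hedged sketch: normalize an LLM's free-form actor mention to a canonical
# name from a known inventory. A simple character-bigram Dice score stands
# in for the trained classifier used in the paper.
from collections import Counter


def bigrams(s: str) -> Counter:
    """Character-bigram counts of a lowercased string."""
    s = s.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (0.0 to 1.0)."""
    ca, cb = bigrams(a), bigrams(b)
    overlap = sum((ca & cb).values())
    total = sum(ca.values()) + sum(cb.values())
    return 2 * overlap / total if total else 0.0


def normalize(llm_output: str, canonical_actors: list[str]) -> str:
    """Pick the canonical form most similar to the LLM's generation."""
    return max(canonical_actors, key=lambda c: similarity(llm_output, c))


# Illustrative inventory of canonical actor names (not from the paper's data).
actors = ["Angela Merkel", "Horst Seehofer", "Thomas de Maizière"]
print(normalize("Chancellor Merkel", actors))  # "Angela Merkel"
```

In the paper's setting, the normalization step is what recovers the performance the LLM loses through uncontrolled surface forms; any classifier that maps generations into the closed set of canonical names plays the same role as this toy matcher.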
