Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation (2312.13608v1)
Abstract: Counter-argument generation -- a captivating area in computational linguistics -- seeks to craft statements that offer opposing views. While most research has ventured into paragraph-level generation, sentence-level counter-argument generation brings its own constraints and brevity-focused challenges. Furthermore, the diverse nature of counter-arguments means that model performance cannot be judged on n-gram-based metrics alone. In this paper, we present ArgTersely, a benchmark for sentence-level counter-argument generation drawn from a manually annotated dataset built on the ChangeMyView debate forum. We also propose Arg-LlaMA, a framework for generating high-quality counter-arguments. For better evaluation, we train a BERT-based evaluator, Arg-Judge, on human preference data. We conduct comparative experiments against baselines including LlaMA, Alpaca, and GPT-3. The results show the competitiveness of our proposed framework and evaluator in counter-argument generation tasks. Code and data are available at https://github.com/amazingljy1206/ArgTersely.
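The evaluation setup the abstract describes -- a BERT-based judge trained on human preference data -- matches the familiar pairwise reward-modeling recipe. Below is a minimal sketch of one plausible formulation, assuming a ranking loss over a [CLS]-pooled BERT scorer; the model checkpoint, loss, hyperparameters, and example texts are all illustrative assumptions, not the released Arg-Judge implementation.

```python
# Sketch of a BERT-based pairwise preference evaluator in the spirit of
# Arg-Judge. Everything here (model choice, loss, data) is assumed.
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
score_head = torch.nn.Linear(encoder.config.hidden_size, 1)

def score(argument: str, counter: str) -> torch.Tensor:
    # Encode the (argument, counter-argument) pair and score the [CLS] token.
    enc = tokenizer(argument, counter, return_tensors="pt",
                    truncation=True, max_length=256)
    cls = encoder(**enc).last_hidden_state[:, 0]
    return score_head(cls).squeeze(-1)

def preference_loss(argument: str, preferred: str, rejected: str) -> torch.Tensor:
    # Bradley-Terry ranking loss: the human-preferred counter-argument
    # should receive a higher score than the rejected one.
    return -F.logsigmoid(score(argument, preferred) - score(argument, rejected)).mean()

optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(score_head.parameters()), lr=2e-5)

# One illustrative update on a single hypothetical preference triple.
optimizer.zero_grad()
loss = preference_loss(
    "CMV: Standardized tests are the fairest way to rank applicants.",
    "Test scores track family income closely, so they measure access to "
    "preparation as much as ability.",
    "No, you are simply wrong about tests.")
loss.backward()
optimizer.step()
```

A judge of this shape scores candidate counter-arguments directly against the original argument, so ranking does not depend on n-gram overlap with a single reference -- the failure mode of BLEU/ROUGE-style metrics that the abstract highlights.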
- Milad Alshomary, Shahbaz Syed, Arkajit Dhar, Martin Potthast, and Henning Wachsmuth. 2021. Counter-argument generation by attacking weak premises. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 1816–1827, Online. Association for Computational Linguistics.
- Milad Alshomary and Henning Wachsmuth. 2023. Conclusion-based counter-argument generation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 957–967, Dubrovnik, Croatia. Association for Computational Linguistics.
- Eric Bolton, Alex Calderwood, Niles Christensen, Jerome Kafrouni, and Iddo Drori. 2020. High quality real-time structured debate generation. arXiv preprint arXiv:2012.00209.
- Tom B. Brown et al. 2020. Language models are few-shot learners. In Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS’20, Red Hook, NY, USA. Curran Associates Inc.
- Sébastien Bubeck et al. 2023. Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv preprint arXiv:2303.12712.
- Yukang Chen et al. 2023. LongLoRA: Efficient fine-tuning of long-context large language models. arXiv preprint arXiv:2309.12307.
- Aakanksha Chowdhery et al. 2023. PaLM: Scaling language modeling with Pathways. Journal of Machine Learning Research, 24(240):1–113.
- T. Edward Damer. 2009. Attacking Faulty Reasoning: A Practical Guide to Fallacy-Free Arguments. Wadsworth/Cengage Learning, Belmont, CA.
- Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. 2023. QLoRA: Efficient finetuning of quantized LLMs. arXiv preprint arXiv:2305.14314.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
- Or Honovich, Uri Shaham, Samuel R. Bowman, and Omer Levy. 2023. Instruction induction: From few examples to natural language task descriptions. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1935–1952, Toronto, Canada. Association for Computational Linguistics.
- Neil Houlsby et al. 2019. Parameter-efficient transfer learning for NLP. In International Conference on Machine Learning, pages 2790–2799. PMLR.
- Edward J. Hu et al. 2022. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations.
- Xinyu Hua, Zhe Hu, and Lu Wang. 2019. Argument generation with retrieval, planning, and realization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2661–2672, Florence, Italy. Association for Computational Linguistics.
- Xinyu Hua and Lu Wang. 2018. Neural argument generation augmented with externally retrieved evidence. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 219–230, Melbourne, Australia. Association for Computational Linguistics.
- Lu Ji et al. 2021. Discrete argument representation learning for interactive argument pair identification. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5467–5478, Online. Association for Computational Linguistics.
- Christopher Kee. 2006. The art of argument: a guide to mooting, volume 196. Cambridge University Press.
- Samuel Kotz and Norman L Johnson. 2012. Breakthroughs in Statistics: Methodology and distribution. Springer Science & Business Media.
- Alon Lavie and Abhaya Agarwal. 2007. METEOR: An automatic metric for MT evaluation with high levels of correlation with human judgments. In Proceedings of the Second Workshop on Statistical Machine Translation, pages 228–231, Prague, Czech Republic. Association for Computational Linguistics.
- Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. 2020. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7871–7880, Online. Association for Computational Linguistics.
- Xiang Lisa Li and Percy Liang. 2021. Prefix-tuning: Optimizing continuous prompts for generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4582–4597, Online. Association for Computational Linguistics.
- Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics.
- Ilya Loshchilov and Frank Hutter. 2018. Decoupled weight decay regularization. In International Conference on Learning Representations.
- Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA. Association for Computational Linguistics.
- Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.
- Benjamin Schiller, Johannes Daxenberger, and Iryna Gurevych. 2021. Aspect-controlled neural argument generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 380–396, Online. Association for Computational Linguistics.
- Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, and Sameer Singh. 2020. AutoPrompt: Eliciting knowledge from language models with automatically generated prompts. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4222–4235, Online. Association for Computational Linguistics.
- Chandan Singh et al. 2022. Explaining patterns in data with language models via interpretable autoprompting. arXiv preprint arXiv:2210.01848.
- Christian Stab, Tristan Miller, Benjamin Schiller, Pranav Rai, and Iryna Gurevych. 2018. Cross-topic argument mining from heterogeneous sources. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3664–3674, Brussels, Belgium. Association for Computational Linguistics.
- Chenhao Tan, Vlad Niculae, Cristian Danescu-Niculescu-Mizil, and Lillian Lee. 2016. Winning arguments: Interaction dynamics and persuasion strategies in good-faith online discussions. In Proceedings of the 25th International Conference on World Wide Web, pages 613–624.
- Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, and Tatsunori B. Hashimoto. 2023. Stanford Alpaca: An instruction-following LLaMA model. https://github.com/tatsu-lab/stanford_alpaca.
- Stephen E. Toulmin. 2003. The uses of argument. Cambridge University Press.
- Hugo Touvron et al. 2023. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.
- Hugo Touvron et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pages 6000–6010, Red Hook, NY, USA. Curran Associates Inc.
- Yizhong Wang et al. 2023. Self-instruct: Aligning language models with self-generated instructions. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13484–13508, Toronto, Canada. Association for Computational Linguistics.
- Jason Wei et al. 2022. Finetuned language models are zero-shot learners. In International Conference on Learning Representations.
- Jason Wei et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems.
- Seonghyeon Ye et al. 2023. Guess the instruction! Flipped learning makes language models stronger zero-shot learners. In The Eleventh International Conference on Learning Representations.
- Jian Yuan et al. 2021. Leveraging argumentation knowledge graph for interactive argument pair identification. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2310–2319, Online. Association for Computational Linguistics.
- Yizhe Zhang et al. 2020. DIALOGPT: Large-scale generative pre-training for conversational response generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 270–278, Online. Association for Computational Linguistics.
- Yongchao Zhou et al. 2023. Large language models are human-level prompt engineers. In International Conference on Learning Representations.
Authors: Jiayu Lin, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei