Can formal argumentative reasoning enhance LLMs performances? (2405.13036v1)
Abstract: Recent years have witnessed significant performance advances in deep-learning-driven natural language processing, with a strong focus on the development and release of LLMs. These improvements have produced higher-quality AI-generated output, but they rely on resource-intensive training and upgrading of models. Although various studies have proposed techniques to enhance LLMs without retraining, none has considered computational argumentation as an option. This is a missed opportunity, since computational argumentation is an intuitive mechanism that formally captures agents' interactions and the conflicts of information that may arise during such interplay, and so it seems well suited to boosting the reasoning and conversational abilities of LLMs in a seamless manner. In this paper, we present a pipeline (MQArgEng) and a preliminary study to evaluate the effect of introducing computational argumentation semantics on the performance of LLMs. Our experiment's goal was to provide a proof of concept and a feasibility analysis in order to foster (or deter) future research towards a fully-fledged argumentation-engine plugin for LLMs. Exploratory results on MT-Bench indicate that MQArgEng yields a moderate performance gain in most of the examined topical categories and, as such, shows promise and warrants further research.
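The "computational argumentation semantics" the abstract refers to are typically grounded in Dung-style abstract argumentation frameworks: a set of arguments plus a binary attack relation, from which acceptable sets of arguments (extensions) are computed. The paper does not publish its implementation, so the following is only a minimal illustrative sketch of one such semantics, the grounded extension, computed as the least fixpoint of the characteristic function; the function name and representation are this sketch's own choices, not MQArgEng's.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of an abstract argumentation framework.

    arguments: a set of argument labels.
    attacks: a set of (attacker, target) pairs.

    Iterates the characteristic function F(S) = {a : every attacker of a
    is itself attacked by some member of S} from the empty set until a
    fixpoint is reached; the least fixpoint is the grounded extension.
    """
    # Precompute, for each argument, the set of arguments attacking it.
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}
    extension = set()
    while True:
        # An argument is defended if each of its attackers is counter-attacked
        # by the current extension (unattacked arguments are trivially defended).
        defended = {
            a for a in arguments
            if all(attackers[b] & extension for b in attackers[a])
        }
        if defended == extension:
            return extension
        extension = defended
```

For example, with arguments {a, b, c} and attacks a→b and b→c, the grounded extension is {a, c}: a is unattacked, a defeats b, and thereby reinstates c. A mutual attack between two otherwise-unattacked arguments yields the empty extension, reflecting the semantics' skeptical stance toward unresolved conflict.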
- Federico Castagna
- Isabel Sassoon
- Simon Parsons