Can formal argumentative reasoning enhance LLMs performances? (2405.13036v1)

Published 16 May 2024 in cs.CL and cs.AI

Abstract: Recent years witnessed significant performance advancements in deep-learning-driven natural language models, with a strong focus on the development and release of LLMs. These improvements resulted in better quality AI-generated output but rely on resource-expensive training and upgrading of models. Although different studies have proposed a range of techniques to enhance LLMs without retraining, none have considered computational argumentation as an option. This is a missed opportunity since computational argumentation is an intuitive mechanism that formally captures agents' interactions and the information conflict that may arise during such interplays, and so it seems well-suited for boosting the reasoning and conversational abilities of LLMs in a seamless manner. In this paper, we present a pipeline (MQArgEng) and preliminary study to evaluate the effect of introducing computational argumentation semantics on the performance of LLMs. Our experiment's goal was to provide a proof-of-concept and a feasibility analysis in order to foster (or deter) future research towards a fully-fledged argumentation engine plugin for LLMs. Exploratory results using the MT-Bench indicate that MQArgEng provides a moderate performance gain in most of the examined topical categories and, as such, show promise and warrant further research.
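
The abstract leaves "computational argumentation semantics" undefined. As background, the sketch below computes the grounded extension of a Dung-style abstract argumentation framework (Dung, 1995), one of the standard semantics such an engine could apply. This is illustrative only: the abstract does not describe MQArgEng's internals, and the function name `grounded_extension` and the toy framework are assumptions made for the example.

```python
# Minimal sketch of Dung-style abstract argumentation (Dung, 1995),
# the formal machinery that "computational argumentation semantics"
# refers to. How MQArgEng wires such semantics into an LLM pipeline
# is not detailed in the abstract; this code is illustrative only.

def grounded_extension(arguments, attacks):
    """Compute the grounded extension of the framework (arguments, attacks).

    `arguments` is a set of argument labels; `attacks` is a set of
    (attacker, target) pairs over those labels. The grounded extension
    is the least fixed point of the characteristic function
    F(S) = {a | every attacker of a is attacked by some member of S}.
    """
    # Map each argument to the set of arguments attacking it.
    attackers = {a: {x for (x, y) in attacks if y == a} for a in arguments}

    def defended(s):
        # a is defended by s if each attacker of a is itself
        # attacked by some argument in s.
        return {
            a for a in arguments
            if all(attackers[b] & s for b in attackers[a])
        }

    s = set()
    while True:
        nxt = defended(s)
        if nxt == s:
            return s
        s = nxt

# Example: a attacks b, b attacks c. The grounded extension is {a, c}:
# a is unattacked, and a defends c against b.
print(grounded_extension({"a", "b", "c"}, {("a", "b"), ("b", "c")}))  # {'a', 'c'}
```

In broad terms, an argumentation engine plugin would build such a framework from conflicting statements arising in an LLM's reasoning and retain only the arguments in an accepted extension; the plumbing for that step is not specified in the abstract.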

Authors (3)
  1. Federico Castagna (8 papers)
  2. Isabel Sassoon (3 papers)
  3. Simon Parsons (29 papers)
Citations (2)
