An Artificial Neuron for Enhanced Problem Solving in Large Language Models (2404.14222v1)

Published 22 Apr 2024 in cs.HC

Abstract: Recent advancements in artificial intelligence have propelled the capabilities of LLMs, yet their ability to mimic nuanced human reasoning remains limited. This paper introduces a novel conceptual enhancement to LLMs, termed the Artificial Neuron, designed to significantly bolster cognitive processing by integrating external memory systems. This enhancement mimics neurobiological processes, facilitating advanced reasoning and learning through a dynamic feedback loop mechanism. We propose a unique framework wherein each LLM interaction, specifically in solving complex math word problems and commonsense reasoning tasks, is recorded and analyzed. Incorrect responses are refined using a higher-capacity LLM or human-in-the-loop corrections, and both the query and the enhanced response are stored in a vector database, structured much like neuronal synaptic connections. This Artificial Neuron thus serves as an external memory aid, allowing the LLM to reference past interactions and apply learned reasoning strategies to new problems. Our experimental setup involves training with the GSM8K dataset for initial model response generation, followed by systematic refinements through feedback loops. Subsequent testing demonstrated a significant improvement in accuracy and efficiency, underscoring the potential of external memory systems to advance LLMs beyond current limitations. This approach not only enhances the LLM's problem-solving precision but also reduces computational redundancy, paving the way for more sophisticated applications of artificial intelligence in cognitive tasks. This paper details the methodology, implementation, and implications of the Artificial Neuron model, offering a transformative perspective on enhancing machine intelligence.
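The abstract describes a retrieve-generate-refine-store loop: similar past interactions are recalled from a vector store as context, the base model answers, incorrect answers are corrected by a stronger model or a human, and the corrected pair is written back to memory. The following is a minimal Python sketch of that loop, not the paper's implementation; `embed`, `base_llm`, `refine`, and `is_correct` are hypothetical stand-ins, since the abstract does not name the embedding model, vector database, or refinement model used.

```python
import numpy as np

class ArtificialNeuron:
    """External memory: stores (query, refined answer) pairs and
    retrieves the most similar past interactions as context."""

    def __init__(self, embed):
        self.embed = embed   # hypothetical: text -> np.ndarray vector
        self.keys = []       # query embeddings (the "synaptic" index)
        self.values = []     # (query, refined_answer) pairs

    def store(self, query: str, refined_answer: str) -> None:
        self.keys.append(self.embed(query))
        self.values.append((query, refined_answer))

    def recall(self, query: str, k: int = 3) -> list:
        if not self.keys:
            return []
        q = self.embed(query)
        keys = np.stack(self.keys)
        # cosine similarity between the new query and stored queries
        sims = keys @ q / (np.linalg.norm(keys, axis=1) * np.linalg.norm(q) + 1e-9)
        top = np.argsort(sims)[::-1][:k]
        return [self.values[i] for i in top]


def solve(problem, base_llm, refine, is_correct, memory: ArtificialNeuron):
    # 1. Recall similar solved problems and prepend them as exemplars.
    exemplars = memory.recall(problem)
    context = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in exemplars)
    answer = base_llm(f"{context}\n\nQ: {problem}\nA:")
    # 2. Feedback loop: a wrong answer is corrected by a higher-capacity
    #    model (or a human), then the corrected pair is stored so future
    #    queries can reuse the learned reasoning.
    if not is_correct(problem, answer):
        answer = refine(problem, answer)
    memory.store(problem, answer)
    return answer
```

A production version would replace the in-memory list with a vector database, as the abstract suggests, but the store/recall interface would stay the same.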

Authors (1)
  1. Sumedh Rasal (6 papers)