A Neural Rewriting System to Solve Algorithmic Problems (2402.17407v2)

Published 27 Feb 2024 in cs.NE, cs.AI, and cs.CL

Abstract: Modern neural network architectures still struggle to learn algorithmic procedures that require the systematic application of compositional rules to solve out-of-distribution problem instances. In this work, we focus on formula simplification problems, a class of synthetic benchmarks used to study the systematic generalization capabilities of neural architectures. We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas while relying only on a minimal set of training examples. Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, the architecture comprises three specialized and interacting modules: the Selector, trained to identify solvable sub-expressions; the Solver, mapping sub-expressions to their values; and the Combiner, replacing sub-expressions in the original formula with the solution provided by the Solver. We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art LLM (GPT-4) probed with advanced prompting strategies. We demonstrate that our approach achieves a higher degree of out-of-distribution generalization than these alternatives on three types of formula simplification problems, and we discuss its limitations by analyzing its failures.
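
The following sketch gives a rough, runnable picture of the Selector–Solver–Combiner loop described above. It is a minimal illustration with toy symbolic stand-ins: the `selector`, `solver`, and `combiner` functions and the innermost-parentheses heuristic are assumptions made for this example, not the paper's learned neural modules.

```python
import re

def selector(formula: str):
    """Stand-in Selector: locate an innermost, fully parenthesized sub-expression."""
    match = re.search(r"\(([^()]+)\)", formula)
    return match.group(0) if match else None

def solver(sub_expr: str) -> str:
    """Stand-in Solver: map a sub-expression to its value (a learned module in the paper)."""
    return str(eval(sub_expr.strip("()")))  # toy arithmetic only; never eval untrusted input

def combiner(formula: str, sub_expr: str, value: str) -> str:
    """Stand-in Combiner: splice the solved value back into the formula."""
    return formula.replace(sub_expr, value, 1)

def simplify(formula: str, max_steps: int = 50) -> str:
    """Iteratively rewrite the formula until no solvable sub-expression remains."""
    for _ in range(max_steps):
        sub_expr = selector(formula)
        if sub_expr is None:
            break
        formula = combiner(formula, sub_expr, solver(sub_expr))
    return formula

print(simplify("((1+2)*(3+4))"))  # -> 21
```

The design mirrors classic rewriting systems: a deeply nested formula is reduced by repeatedly applying a single local rewriting step, so each module only ever handles one sub-expression at a time, regardless of the overall nesting depth.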

References (30)
  1. Analyzing the nuances of transformers’ polynomial simplification abilities. In MATH-AI Workshop. International Conference on Learning Representations, May 2021.
  2. Exploring length generalization in large language models. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, 2022.
  3. Learning of human-like algebraic reasoning using deep feedforward neural networks. Biol. Inspired Cogn. Arch., 25:43–50, August 2018.
  4. Learning to perform local rewriting for combinatorial optimization. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 6278–6289, 2019.
  5. The devil is in the detail: Simple tricks improve systematic generalization of transformers. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih, editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, pages 619–634. Association for Computational Linguistics, 2021.
  6. CTL++: evaluating generalization on never-seen compositional patterns of known functions, and compatibility of neural representations. In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, pages 9758–9767. Association for Computational Linguistics, 2022.
  7. The neural data router: Adaptive control flow in transformers improves systematic generalization. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022.
  8. Rewrite systems. In Formal Models and Semantics, pages 243–320. Elsevier, 1990.
  9. Faith and fate: Limits of transformers on compositionality. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  10. Connectionism and cognitive architecture: a critical analysis. Cognition, 28(1-2):3–71, March 1988.
  11. Hypernetworks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.
  12. Compositionality decomposed: How do neural networks generalise? Journal of Artificial Intelligence Research, 67:757–795, 2020.
  13. Large language models are zero-shot reasoners. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, 2022.
  14. Ekaterina Komendantskaya. Parallel rewriting in neural networks. In Proceedings of the International Joint Conference on Computational Intelligence (IJCCI 2009) - ICNC, pages 452–458. INSTICC, SciTePress, 2009.
  15. Human-like systematic generalization through a meta-learning neural network. Nature, 623(7985):115–121, November 2023.
  16. Deep learning for symbolic mathematics. CoRR, abs/1912.01412, 2019.
  17. Systematic generalization and emergent structures in transformers trained on structured tasks. CoRR, abs/2210.00400, 2022.
  18. Memorize or generalize? searching for a compositional RNN in a haystack. CoRR, abs/1802.06467, 2018.
  19. Mastering algebra retrains the visual system to perceive hierarchical structure in equations. Cognitive Research: Principles and Implications, 1:1–10, 2016.
  20. SymPy: symbolic computing in Python. PeerJ Computer Science, 3:e103, January 2017.
  21. Is a modular architecture enough? CoRR, abs/2206.02713, 2022.
  22. ListOps: A diagnostic dataset for latent tree learning. In Silvio Ricardo Cordeiro, Shereen Oraby, Umashanthi Pavalanathan, and Kyeongmin Rim, editors, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 2-4, 2018, Student Research Workshop, pages 92–99. Association for Computational Linguistics, 2018.
  23. Making transformers solve compositional tasks. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pages 3591–3607. Association for Computational Linguistics, 2022.
  24. OpenAI. GPT-4 technical report, 2023.
  25. Measuring and narrowing the compositionality gap in language models. In Houda Bouamor, Juan Pino, and Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5687–5711, Singapore, December 2023. Association for Computational Linguistics.
  26. Randomized positional encodings boost length generalization of transformers. In Anna Rogers, Jordan L. Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 1889–1903. Association for Computational Linguistics, 2023.
  27. Alberto Testolin. Can neural networks do arithmetic? A survey on the elementary numerical skills of state-of-the-art deep learning models. Applied Sciences, 2024.
  28. Attention is all you need. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett, editors, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pages 5998–6008, 2017.
  29. Self-consistency improves chain of thought reasoning in language models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023.
  30. Chain-of-thought prompting elicits reasoning in large language models. In Sanmi Koyejo, S. Mohamed, A. Agarwal, Danielle Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, 2022.
