Multi-Operational Mathematical Derivations in Latent Space (2311.01230v2)
Abstract: This paper investigates the possibility of approximating multiple mathematical operations in latent space for expression derivation. To this end, we introduce different multi-operational representation paradigms, modelling mathematical operations as explicit geometric transformations. By leveraging a symbolic engine, we construct a large-scale dataset comprising 1.7M derivation steps stemming from 61K premises and 6 operators, analysing the properties of each paradigm when instantiated with state-of-the-art neural encoders. Specifically, we investigate how different encoding mechanisms can approximate expression manipulation in latent space, exploring the trade-off between learning different operators and specialising within single operations, as well as the ability to support multi-step derivations and out-of-distribution generalisation. Our empirical analysis reveals that the multi-operational paradigm is crucial for disentangling different operators, while discriminating the conclusions for a single operation is achievable in the original expression encoder. Moreover, we show that architectural choices can heavily affect the training dynamics, structural organisation, and generalisation of the latent space, resulting in significant variations across paradigms and classes of encoders.
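The central idea of modelling operations as explicit geometric transformations can be illustrated with a minimal sketch. The code below is a hypothetical, simplified example (not the paper's actual architecture): it treats one operator as a learned translation vector in latent space, in the spirit of translation-based multi-relational embeddings, and checks that the translated premise embedding lands closer to the true conclusion than to a distractor. All vectors here are random stand-ins for what a trained neural encoder would produce; the variable names (`premise`, `op_differentiate`, etc.) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy latent encodings. In the paper these would come from a trained
# neural encoder over expression trees; here they are random placeholders.
premise = rng.normal(size=dim)
op_differentiate = rng.normal(size=dim)  # hypothetical learned operator vector

# Translation-style multi-operational paradigm: applying an operator
# moves the premise embedding by that operator's vector.
predicted = premise + op_differentiate

# The true conclusion embedding lies near the translated point;
# a distractor conclusion does not.
true_conclusion = predicted + 0.01 * rng.normal(size=dim)
distractor = rng.normal(size=dim)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two latent vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Conclusion retrieval: rank candidates by similarity to the prediction.
assert cosine(predicted, true_conclusion) > cosine(predicted, distractor)
```

Under this paradigm, each operator gets its own transformation, which is what lets the latent space disentangle operators; a single-operation setup would instead only need the encoder to separate valid from invalid conclusions.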