
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process (2405.06232v1)

Published 10 May 2024 in cs.AI

Abstract: Geometry Problem Solving (GPS), a classic and challenging class of math problems, has attracted much attention in recent years. It requires a solver to comprehensively understand both text and diagram, master essential geometry knowledge, and apply that knowledge appropriately in reasoning. However, existing works follow a neural machine translation paradigm and focus only on enhancing the capability of encoders, neglecting the essential characteristics of human geometry reasoning. In this paper, inspired by dual-process theory, we propose a Dual-Reasoning Geometry Solver (DualGeoSolver) to simulate the human dual-reasoning process for GPS. Specifically, we construct two systems in DualGeoSolver: the Knowledge System and the Inference System. The Knowledge System controls an implicit reasoning process and is responsible for providing diagram information and geometry knowledge according to a step-wise reasoning goal generated by the Inference System. The Inference System conducts an explicit reasoning process: it specifies the goal of each reasoning step and applies the knowledge to generate program tokens that resolve it. The two systems carry out this process iteratively, which is more in line with human cognition. We conduct extensive experiments on two benchmark datasets, GeoQA and GeoQA+. The results demonstrate that DualGeoSolver, by explicitly modeling the human reasoning process and knowledge application, is superior in both solving accuracy and robustness.
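
The iterative exchange described in the abstract can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the paper's model: the real systems are learned neural components, whereas here both are lookup tables. All class names, method names, goal strings, and program tokens (`g_add`, `g_minus`, `N_0`, `C_180`, `V_0`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeSystem:
    """Toy stand-in for the implicit system: given a reasoning goal,
    return diagram information (operands) and geometry knowledge (an operator).
    Hypothetical: the paper's version is learned, not a lookup table."""
    knowledge_base: dict

    def retrieve(self, goal: str) -> dict:
        return self.knowledge_base[goal]


@dataclass
class InferenceSystem:
    """Toy stand-in for the explicit system: specify the goal of each step,
    then apply the retrieved knowledge to emit program tokens.
    Hypothetical: goals are a fixed list here instead of being generated."""
    goals: list
    step: int = 0

    def next_goal(self):
        # Return None once every reasoning goal has been resolved.
        if self.step >= len(self.goals):
            return None
        return self.goals[self.step]

    def apply(self, knowledge: dict) -> list:
        # Turn the knowledge into program tokens and advance to the next step.
        self.step += 1
        return [knowledge["operator"], *knowledge["operands"]]


def solve(knowledge_system, inference_system, max_steps=10):
    """Alternate between the two systems until no goal remains."""
    program = []
    for _ in range(max_steps):
        goal = inference_system.next_goal()
        if goal is None:
            break
        knowledge = knowledge_system.retrieve(goal)
        program.extend(inference_system.apply(knowledge))
    return program


# Illustrative problem: find the third angle of a triangle from two known angles.
ks = KnowledgeSystem({
    "sum the two known angles": {"operator": "g_add", "operands": ["N_0", "N_1"]},
    "subtract from the angle sum": {"operator": "g_minus", "operands": ["C_180", "V_0"]},
})
inf = InferenceSystem(["sum the two known angles", "subtract from the angle sum"])
program = solve(ks, inf)
print(program)
```

The point of the sketch is the control flow: the goal always originates in the explicit system, the implicit system only answers with knowledge for that goal, and the program grows one reasoning step at a time.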

Authors (7)
  1. Tong Xiao (119 papers)
  2. Jiayu Liu (38 papers)
  3. Zhenya Huang (52 papers)
  4. Jinze Wu (15 papers)
  5. Jing Sha (9 papers)
  6. Shijin Wang (69 papers)
  7. Enhong Chen (242 papers)
