Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning (2105.04165v3)

Published 10 May 2021 in cs.CL, cs.AI, cs.CV, and cs.FL

Abstract: Geometry problem solving has attracted much attention in the NLP community recently. The task is challenging as it requires abstract problem understanding and symbolic reasoning with axiomatic knowledge. However, current datasets are either small in scale or not publicly available. Thus, we construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language. We further propose a novel geometry solving approach with formal language and symbolic reasoning, called Interpretable Geometry Problem Solver (Inter-GPS). Inter-GPS first parses the problem text and diagram into formal language automatically via rule-based text parsing and neural object detecting, respectively. Unlike implicit learning in existing methods, Inter-GPS incorporates theorem knowledge as conditional rules and performs symbolic reasoning step by step. Also, a theorem predictor is designed to infer the theorem application sequence fed to the symbolic solver for the more efficient and reasonable searching path. Extensive experiments on the Geometry3K and GEOS datasets demonstrate that Inter-GPS achieves significant improvements over existing methods. The project with code and data is available at https://lupantech.github.io/inter-gps.

Authors
  1. Pan Lu
  2. Ran Gong
  3. Shibiao Jiang
  4. Liang Qiu
  5. Siyuan Huang
  6. Xiaodan Liang
  7. Song-Chun Zhu
Citations (154)

Summary

  • The paper introduces Inter-GPS, a novel method that converts geometry problems into formal language for symbolic reasoning.
  • It presents the Geometry3K dataset with 3,002 annotated problems to benchmark and advance geometry problem solving.
  • Experimental results show Inter-GPS reaching 57.5% accuracy on Geometry3K, outperforming the baselines; the authors point to educational applications as a potential use.

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning

This paper proposes Inter-GPS (Interpretable Geometry Problem Solver), an approach to automated geometry problem solving. Inter-GPS uses formal language and symbolic reasoning to solve high-school level geometry problems, a task that remains challenging because it demands abstract reasoning and the application of axiomatic knowledge.

Key Contributions and Approach

  1. Benchmark Dataset: The authors present the Geometry3K dataset, aimed at advancing research in geometry problem solving. This dataset comprises 3,002 problems annotated with formal language, providing a comprehensive resource for testing solution algorithms. Unlike previous datasets, Geometry3K is publicly available and covers a wide range of geometric shapes and problem goals.
  2. Formal Language Representation: The core innovation behind Inter-GPS is its ability to convert the problem descriptions into a formal language. This transformation facilitates symbolic reasoning by enhancing the interpretability of problem representations. The formal language captures the geometry problem's semantics through predicates and literals, making it suitable for logical processing.
  3. Problem Parsing and Symbolic Reasoning: Inter-GPS first parses text and diagrams using rule-based parsing mechanisms and neural object detection, respectively, to produce formal descriptions. Following this, it engages in explicit symbolic reasoning, utilizing a theorem knowledge base to incrementally update the problem's relational set, driving towards a solution. This step-by-step theorem application mimics human problem-solving strategies in a manner that is both efficient and interpretable.
  4. Theorem Predictor: A theorem predictor forecasts a sequence of theorems to apply, which is fed to the symbolic solver so that it follows a more efficient search path. The sequence is generated by a trained transformer model.
  5. Evaluation and Results: Experiments show significant improvements over competing methods. Inter-GPS achieves an accuracy of 57.5% on the Geometry3K dataset, outperforming neural network-based baselines. Ablation experiments further indicate that parsing accuracy and theorem selection are critical to the solver’s success, and Inter-GPS also outperforms existing solvers on the earlier GEOS dataset.
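The interplay of items 2 to 4 can be illustrated with a minimal sketch. It is not the Inter-GPS implementation: the literal encoding as tuples, the predicate names (`Triangle`, `RightAngle`, `LengthOf`, `MeasureOf`), the two toy theorems, and the `solve` function are all hypothetical, and the theorem order is hand-written here where the paper uses a trained predictor. The sketch only shows the general idea of representing a problem as formal-language literals and applying theorems as conditional rules, step by step, until the goal is derived.

```python
import math

# Hypothetical formal-language literals: (predicate, *arguments).
problem = {
    ("Triangle", "A", "B", "C"),
    ("RightAngle", "A", "B", "C"),  # right angle at vertex B
    ("LengthOf", "AB", 3.0),
    ("LengthOf", "BC", 4.0),
}

def seg(p, q):
    """Canonical name for the segment between points p and q."""
    return "".join(sorted(p + q))

def pythagorean(facts):
    """Conditional rule: right angle + both legs known -> hypotenuse length."""
    known = {f[1]: f[2] for f in facts if f[0] == "LengthOf"}
    derived = set()
    for f in facts:
        if f[0] != "RightAngle":
            continue
        _, p, q, r = f
        leg1, leg2, hyp = seg(p, q), seg(q, r), seg(p, r)
        if leg1 in known and leg2 in known and hyp not in known:
            derived.add(("LengthOf", hyp, math.hypot(known[leg1], known[leg2])))
    return derived

def angle_sum(facts):
    """Conditional rule: two angles of a triangle known -> third angle."""
    angles = {f[1]: f[2] for f in facts if f[0] == "MeasureOf"}
    derived = set()
    for f in facts:
        if f[0] != "Triangle":
            continue
        vertices = f[1:]
        missing = [v for v in vertices if v not in angles]
        if len(missing) == 1:
            total = sum(angles[v] for v in vertices if v in angles)
            derived.add(("MeasureOf", missing[0], 180.0 - total))
    return derived

THEOREMS = {"pythagorean": pythagorean, "angle_sum": angle_sum}

def lookup(facts, goal):
    """Return the value of goal = (predicate, name) if it has been derived."""
    for f in facts:
        if f[0] == goal[0] and f[1] == goal[1]:
            return f[2]
    return None

def solve(facts, goal, theorem_order, max_rounds=5):
    """Apply theorems in the given order until the goal is derived.

    theorem_order stands in for the predicted theorem sequence.
    Returns (value or None, list of (theorem name, new literals)).
    """
    facts = set(facts)
    steps = []
    for _ in range(max_rounds):
        progressed = False
        for name in theorem_order:
            new = THEOREMS[name](facts) - facts
            if new:
                facts |= new
                steps.append((name, list(new)))
                progressed = True
            value = lookup(facts, goal)
            if value is not None:
                return value, steps
        if not progressed:
            break  # no theorem adds anything new; give up
    return None, steps

value, steps = solve(problem, ("LengthOf", "AC"), ["angle_sum", "pythagorean"])
print(value)
for name, new_literals in steps:
    print(name, new_literals)
```

The returned `steps` list records which rule produced which literal, which is the sense in which this style of solver is interpretable. A better theorem order reaches the goal after fewer rule applications, which is the role the summary attributes to the theorem predictor.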

Implications and Future Directions

This research shows how formal language and symbolic reasoning can strengthen AI problem solving in domains that require explicit logic and interpretability. It has implications for automated education tools, where geometry problem-solving capabilities could support learning. By setting a new benchmark and combining symbolic and neural components in one system, the work may also encourage further exploration of symbolic AI's role in complex cognitive tasks.

Moving forward, improving the parsers' accuracy and expanding the theorem knowledge base could narrow the gap between current solver capabilities and human expert-level problem solving. The approach could also be extended to other mathematical domains, broadening the scope of intelligent educational technologies. More generally, the results are relevant to work on AI interpretability and domain-specific learning systems.