
HAGeo: Synthetic Geometry Solver

Updated 2 December 2025
  • HAGeo is a synthetic theorem-proving system that uses hand-engineered heuristics to generate non-trivial auxiliary constructions for Olympiad-level problems.
  • It methodically augments the geometry's deduction graph with points like midpoints, intersections, and reflections, enabling exhaustive symbolic and algebraic reasoning.
  • Empirical benchmarks show HAGeo’s CPU-based pipeline achieves a 93.3% success rate on IMO benchmarks, outperforming previous neural-symbolic methods.

HAGeo is a synthetic theorem-proving system for Olympiad-level Euclidean geometry that achieves gold-medal human performance on International Mathematical Olympiad (IMO) benchmarks using a purely heuristic, CPU-based pipeline. Unlike neural- or LLM-backed solvers, HAGeo operates by systematically augmenting the geometry's deduction graph with auxiliary constructions according to hand-engineered heuristics, allowing exhaustive symbolic and algebraic reasoning to proceed without learned priors. The system is designed to efficiently navigate the classically challenging space of synthetic geometry, focusing on the generation and selection of “non-trivial” auxiliary points—midpoints, intersections, reflections, and special feet—that unlock deductive chains unavailable to direct symbolic or algebraic search alone. HAGeo’s approach demonstrates that filtered random heuristics for auxiliary construction are sufficient to surpass previous neural-symbolic and classic AI approaches on both established and newly curated geometry problem benchmarks (Duan et al., 27 Nov 2025).

1. System Architecture and Deductive Workflow

HAGeo’s architecture is built around three main components: problem encoding, deductive-algebraic reasoning (DDAR), and iterative heuristic auxiliary construction. Problems are parsed into a geometry-specific formal language representing points, lines, circles, and their relationships. The DDAR engine then alternates between constructing deductive databases (DD) of symbolic consequences and running algebraic reasoning (AR) via Gaussian elimination on induced linear constraints among variables such as directed angles. If the goal is unreachable, HAGeo triggers up to $K$ iterations of heuristically generating candidate auxiliary points, recalculating the deduction graph after each batch. This pipeline runs entirely on CPUs in C++ with data structures optimized for sparsity and rapid lookup.

Pseudocode Overview:

Input: geometry problem P; budget K (auxiliary attempts); rounds N per attempt.
Output: proof of P, or fail.

G ← parse_problem(P)
proof ← DDAR(G)
if proof succeeds then return proof
for attempt = 1 to K do
    G' ← copy(G)
    for round = 1 to N do
        C ← enumerate_candidates(G')
        F ← {Q ∈ C : Q is non-trivial in G'}
        if F = ∅ then break
        choose Q ∈ F uniformly at random
        G'.add_object(Q.definition)
    end for
    proof ← DDAR(G')
    if proof succeeds then return proof
end for
return fail

DDAR (the symbolic DD phase followed by the algebraic AR phase) is rerun after each batch of N augmentation rounds, and the search returns either a proof or failure once the budget of K attempts is exhausted.
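The outer search loop can be sketched in Python as follows. This is an illustrative sketch only, not the system's C++ implementation: the function `solve`, its callback parameters, and the toy stand-ins for DDAR and candidate enumeration are all hypothetical names introduced here, and the "graph" is reduced to a plain list of object labels.

```python
import copy
import random


def solve(graph, ddar, enumerate_candidates, is_nontrivial, K, N, rng=None):
    """Retry loop: K attempts, each adding up to N random non-trivial objects.

    All four callbacks are hypothetical placeholders for the real engine.
    """
    rng = rng or random.Random(0)
    proof = ddar(graph)
    if proof is not None:
        return proof
    for _ in range(K):
        g = copy.deepcopy(graph)  # each attempt starts from the original problem
        for _ in range(N):
            filtered = [q for q in enumerate_candidates(g) if is_nontrivial(g, q)]
            if not filtered:
                break
            g.append(rng.choice(filtered))  # uniform choice, no learned ranking
        proof = ddar(g)
        if proof is not None:
            return proof
    return None


# Toy stand-ins (assumptions for illustration): the "proof" needs object M.
def toy_ddar(g):
    return "proof using M" if "M" in g else None


def toy_candidates(g):
    return [c for c in ("M", "X") if c not in g]


def toy_nontrivial(g, q):
    return True


base = ["A", "B", "C"]
result = solve(base, toy_ddar, toy_candidates, toy_nontrivial, K=3, N=2)
print(result)
```

Copying the graph at the start of each attempt keeps failed augmentations from accumulating across attempts, so every attempt is an independent random sample.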

2. Heuristic Auxiliary Construction: Principles and Mechanisms

HAGeo’s success hinges on a carefully designed heuristic for selecting auxiliary constructions. A candidate point $P$ is accepted if it is computable by a classical operation and “entangles” at least two distinct pre-existing geometric objects, i.e.,

$$\#\{\,O : P \in O\,\} \geq 2.$$

This “non-triviality” ensures that auxiliary points meaningfully connect otherwise independent sub-configurations.
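A numerical version of this incidence count might look like the sketch below. The representation is an assumption made for illustration: lines are coefficient triples `(a, b, c)` for `ax + by + c = 0`, circles are `(cx, cy, r)`, and the tolerance and function names are hypothetical rather than taken from the paper.

```python
import math


def on_line(p, line, tol=1e-9):
    # Distance from point p to the line ax + by + c = 0.
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / math.hypot(a, b) <= tol


def on_circle(p, circle, tol=1e-9):
    cx, cy, r = circle
    return abs(math.hypot(p[0] - cx, p[1] - cy) - r) <= tol


def incidence_count(p, lines, circles, tol=1e-9):
    # Number of existing objects O with p on O.
    on_lines = sum(on_line(p, l, tol) for l in lines)
    on_circles = sum(on_circle(p, c, tol) for c in circles)
    return on_lines + on_circles


def is_nontrivial(p, lines, circles, tol=1e-9):
    return incidence_count(p, lines, circles, tol) >= 2


lines = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]  # x-axis and y-axis
circles = [(0.0, 0.0, 1.0)]                 # unit circle

print(is_nontrivial((1.0, 0.0), lines, circles))  # on the x-axis and the circle
print(is_nontrivial((0.5, 0.0), lines, circles))  # on the x-axis only
```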

The six concrete categories for heuristically generated points are:

  1. Intersection of any two among at least three existing lines or circles.
  2. Midpoint of $AB$ if it also lies on another line or circle.
  3. Reflection of $A$ about $B$ if the reflected point lies on another object.
  4. Foot of the perpendicular from $A$ onto a line, if this point non-trivially lies elsewhere.
  5. Direct intersections from pairs among at least three existing objects.
  6. Unstructured random constructions (e.g., points forming equilateral triangles) to mitigate heuristic rigidity.

Filtering is followed by uniform random selection, sidestepping the need for learned ranking. New constructions are numerically validated in floating-point to discard degenerate cases, leveraging generic geometric embeddings. No neural inference is used at any stage (Duan et al., 27 Nov 2025).
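As a rough illustration of the filter-then-sample step, the sketch below handles only the midpoint category and only checks incidence with circles, which is a simplification of the rule described above. The data layout, tolerance, and function names are assumptions introduced for this example.

```python
import itertools
import math
import random


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def on_circle(p, circle, tol=1e-9):
    (cx, cy), r = circle
    return abs(math.hypot(p[0] - cx, p[1] - cy) - r) <= tol


def filtered_midpoints(points, circles, tol=1e-9):
    """Midpoints of point pairs that land on an existing circle."""
    kept = []
    for (na, a), (nb, b) in itertools.combinations(sorted(points.items()), 2):
        m = midpoint(a, b)
        # Numerical validation: discard a candidate that coincides with
        # an existing point (a degenerate construction).
        if any(math.dist(m, p) <= tol for p in points.values()):
            continue
        if any(on_circle(m, c, tol) for c in circles):
            kept.append((na, nb, m))
    return kept


def pick_uniform(candidates, rng):
    # Uniform random selection among survivors; no scoring or ranking.
    return rng.choice(candidates) if candidates else None


points = {"A": (-1.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 2.0)}
circles = [((0.0, 1.0), 1.0)]  # passes through (0, 0) and C

candidates = filtered_midpoints(points, circles)
chosen = pick_uniform(candidates, random.Random(0))
print(chosen)
```

Here only the midpoint of `AB` survives the filter, because it is the only midpoint that lies on the circle.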

3. Algorithmic and Algebraic Optimization

HAGeo implements several design and algorithmic optimizations targeting symbolic-deductive and algebraic reasoning efficiency:

  • Deductive-algebraic reasoning alternates DD with AR, the latter encoding angle equalities as sparse linear relations:

$$(\mathrm{dir}(\ell_i) - \mathrm{dir}(\ell_j)) - (\mathrm{dir}(\ell_k) - \mathrm{dir}(\ell_m)) = 0,$$

where $\mathrm{dir}(\ell)$ is the directed angle of line $\ell$.

  • AR complexity is reduced by aggressive merging of equivalent variables; Gaussian elimination thus operates on a matrix whose size is halved.
  • Deduction rules are simplified to single-premise forms (e.g., two-angle equalities imply a third) to reduce combinatorial search.
  • Data structures include deduction graphs, sparse hash-maps for fact lookup and candidate testing, and dynamic sparse matrices for AR.
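To make the AR step concrete, the sketch below encodes each angle equality as a coefficient row over the line-direction variables and uses exact Gaussian elimination to test whether a goal equality lies in the span of the known ones. It is a dense toy version under stated assumptions: the helper names are hypothetical, arithmetic is done with `Fraction`, and the fact that directed angles are only defined modulo $\pi$ is ignored.

```python
from fractions import Fraction


def relation(n, i, j, k, m):
    """Row for (dir_i - dir_j) - (dir_k - dir_m) = 0 over n line variables."""
    row = [Fraction(0)] * n
    row[i] += 1
    row[j] -= 1
    row[k] -= 1
    row[m] += 1
    return row


def reduce_against(row, basis):
    # Eliminate every pivot column of the basis from the row.
    row = row[:]
    for pivot, b in basis:
        if row[pivot] != 0:
            f = row[pivot]
            row = [x - f * y for x, y in zip(row, b)]
    return row


def add_row(row, basis):
    """Insert a relation into the basis; return False if it was redundant."""
    r = reduce_against(row, basis)
    for col, x in enumerate(r):
        if x != 0:
            basis.append((col, [y / x for y in r]))  # normalize pivot to 1
            return True
    return False


def implied(goal, basis):
    # The goal follows linearly iff it reduces to the zero row.
    return all(x == 0 for x in reduce_against(goal, basis))


n = 6  # six lines l0..l5
basis = []
add_row(relation(n, 0, 1, 2, 3), basis)  # angle(l0, l1) = angle(l2, l3)
add_row(relation(n, 2, 3, 4, 5), basis)  # angle(l2, l3) = angle(l4, l5)

print(implied(relation(n, 0, 1, 4, 5), basis))  # transitivity of the two facts
print(implied(relation(n, 0, 2, 4, 5), basis))  # not derivable from them
```

A production engine would use sparse rows, as the text describes; the dense lists here are only for readability.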

Typical complexity per deduction pass is $O(R \cdot \Delta)$ (with $R$ rules and current graph size $\Delta$), and $O(n^3)$ for AR, where $n$ is the number of distinct variables, observed to be $n \sim 100$ after optimizations (Duan et al., 27 Nov 2025).
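One common way to merge equivalent variables before elimination is a union-find structure that maps each variable to a class representative. The sketch below is an assumption about how such merging could be done, not a description of HAGeo's internals; the class and function names are hypothetical. Because elimination is cubic in $n$, halving the number of columns cuts that cost by roughly a factor of eight.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # Path halving keeps the trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[max(ra, rb)] = min(ra, rb)


def compress(n, equal_pairs):
    """Map each of n variables to a dense column index of its class."""
    uf = UnionFind(n)
    for a, b in equal_pairs:
        uf.union(a, b)
    reps = sorted({uf.find(i) for i in range(n)})
    index = {r: k for k, r in enumerate(reps)}
    return [index[uf.find(i)] for i in range(n)]


# Hypothetical input: variables 0, 1, 2 are known equal, as are 4 and 5.
columns = compress(6, [(0, 1), (1, 2), (4, 5)])
n_after = len(set(columns))
print(columns, n_after, (6 / n_after) ** 3)
```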

4. Empirical Results and Benchmarks

On the IMO-30 benchmark, HAGeo achieves superior success rates compared to previous symbolic and neural-symbolic systems:

| Method | Problems solved (60 s) | Success rate |
|---|---|---|
| DDAR alone | 15/30 | 50% |
| AlphaGeometry | 24/30 | 80% |
| Random-aux + DDAR | 25/30 | 83.3% |
| HAGeo | 28/30 | 93.3% |

On the more challenging HAGeo-409 benchmark (with human-assessed difficulty from 1–7), HAGeo consistently outperforms AlphaGeometry, particularly on moderate to difficult problems (e.g., 83.0% success rate for HAGeo versus 39.3% for AlphaGeometry in the [3,4) difficulty range). Average running time per problem is 1.75 seconds on a 64-core CPU, which is a 24-fold speedup over AlphaGeometry’s DDAR (Duan et al., 27 Nov 2025).
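The IMO-30 success rates follow directly from the solved counts reported above; a short check of the arithmetic (the dictionary layout is just a convenience for this example):

```python
# Solved counts as reported for the IMO-30 benchmark.
solved = {
    "DDAR alone": 15,
    "AlphaGeometry": 24,
    "Random-aux + DDAR": 25,
    "HAGeo": 28,
}
TOTAL = 30

rates = {method: round(100 * count / TOTAL, 1) for method, count in solved.items()}
for method, rate in rates.items():
    print(f"{method}: {rate}%")
```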

5. Dataset Construction: HAGeo-409

To address the limitations of the small and relatively undemanding IMO-30 benchmark, HAGeo-409 was constructed, consisting of 409 geometry problems sourced from AoPS and ShuZhiMi, each with formal geometric encodings and human-rated difficulty. Only ~20% of problems were converted automatically from natural language using GPT-4o; the remainder required manual correction. Each problem’s statement was numerically validated in random embeddings to ensure non-degeneracy, and difficulty labels were derived from aggregated user ratings. The average difficulty is 3.47, with substantial representation across the full range (1–7), exposing system strengths and weaknesses more granularly (Duan et al., 27 Nov 2025).
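The idea of validating a formalized statement in random embeddings can be illustrated with a small sketch. It assumes a deliberately simple statement (the segment joining the midpoints of $AB$ and $AC$ is parallel to $BC$), and the sampling range, area threshold, tolerance, and function names are all choices made for this example rather than details from the paper.

```python
import random


def cross(u, v):
    return u[0] * v[1] - u[1] * v[0]


def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])


def mid(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def random_triangle(rng, min_area=1e-3):
    # Resample until the embedding is non-degenerate (area not near zero).
    while True:
        a, b, c = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
        if abs(cross(sub(b, a), sub(c, a))) / 2 >= min_area:
            return a, b, c


def midline_parallel_to_bc(a, b, c, tol=1e-9):
    # True statement: segment MN is parallel to BC.
    m, n = mid(a, b), mid(a, c)
    return abs(cross(sub(n, m), sub(c, b))) <= tol


def midline_parallel_to_ab(a, b, c, tol=1e-9):
    # Deliberately wrong encoding, to show that validation rejects it.
    m, n = mid(a, b), mid(a, c)
    return abs(cross(sub(n, m), sub(b, a))) <= tol


def validate(statement, trials=50, seed=0):
    rng = random.Random(seed)
    return all(statement(*random_triangle(rng)) for _ in range(trials))


print(validate(midline_parallel_to_bc))
print(validate(midline_parallel_to_ab))
```

A check of this kind cannot prove a statement, but it cheaply flags encodings that are false or that only hold in degenerate configurations.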

6. Worked Example and Qualitative Behavior

A typical HAGeo run on IMO 2008 Problem 6 illustrates system operation:

  • The initial configuration is parsed and processed via DDAR, which fails to reach the goal from the deduced consequences alone.
  • The first heuristic round identifies viable midpoints ($I_1 = \mathrm{mid}(B, D)$), numerically confirmed to lie on significant circles.
  • The second round finds a related midpoint ($I_2 = \mathrm{mid}(C, D)$).
  • A subsequent DDAR pass, now utilizing these new objects, rapidly deduces requisite similar triangles and equal distances, closing the synthetic proof. Auxiliary points are only introduced when they non-trivially augment the configuration, as enforced by the filtering mechanism (Duan et al., 27 Nov 2025).

7. Limitations, Open Problems, and Future Directions

Several aspects of HAGeo’s architecture motivate future research and possible extensions:

  • The uniform random selection among valid auxiliaries may waste search budget; a learned or numerically-inspired scoring function could increase efficiency.
  • The palette of construction types is limited (e.g., omitting spiral similarity centers or isogonal conjugates); expanding this set could improve coverage but may incur combinatorial costs.
  • Proofs produced by DDAR may lack human-style readability; reordering passes could address this.
  • Search performance is sensitive to the heuristic budget, especially for highly challenging problems; smarter search heuristics (such as beam search or MCTS) may be beneficial.
  • Extension beyond planar Euclidean geometry, such as to 3D configurations or advanced theorems (e.g., involving inversion or circle packings), remains an open challenge (Duan et al., 27 Nov 2025).

A plausible implication is that purely heuristic, numerically filtered auxiliary construction—combined with efficient symbolic deduction—may suffice to approach, or surpass, the practical performance of learned or LLM-based geometric reasoners in the domain of Olympiad-level synthetic geometry.


References:

  • "Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions" (Duan et al., 27 Nov 2025)