Benchmarks for ATPs using ETP data
Develop well-calibrated benchmarking suites for automated theorem provers (ATPs) that leverage the Equational Theories Project (ETP) dataset; specify evaluation protocols, metrics, and datasets that meet community standards for assessing ATP performance on equational reasoning at scale.
References
The objective of using the data from the ETP to establish well-calibrated benchmarks to evaluate ATPs remains an interesting open problem; the participants of this project did not have the required expertise to develop and test such benchmarks to the standards expected in the area.
— The Equational Theories Project: Advancing Collaborative Mathematical Research at Scale
(2512.07087, Bolan et al., 8 Dec 2025), Introduction, Subsection "Outcomes"