Detecting errors at the sub-claim level in LLM reasoning chains
Develop a method to identify and certify errors within the sub-claim components of reasoning chains generated by large language models. The approach should remove the assumption that claims arrive pre-decomposed into atomic units, enable fine-grained soundness assessment, and address the additional computational cost that sub-claim-level analysis introduces.
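One possible shape for such a pipeline, sketched below, is to first decompose each reasoning step into atomic sub-claims and then verify each sub-claim against the earlier steps that serve as its premises, flagging any sub-claim whose support falls below a threshold. This is not the method of the cited paper; the function names, the `decompose`/`verify` callables, and the threshold are hypothetical placeholders that would in practice be backed by an LLM decomposition prompt and an entailment-style verifier.

```python
"""Minimal sketch of sub-claim-level error detection (hypothetical, not the
paper's method). `decompose` and `verify` are assumed to be user-supplied
callables, e.g. an LLM prompt that splits a step into atomic sub-claims and
an entailment model that scores a sub-claim against prior steps."""
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SubClaimResult:
    text: str              # atomic sub-claim extracted from a reasoning step
    support_score: float   # verifier confidence that the premises entail it
    is_error: bool         # flagged when the score falls below the threshold


@dataclass
class StepResult:
    step_text: str
    sub_claims: List[SubClaimResult] = field(default_factory=list)

    @property
    def sound(self) -> bool:
        # A step is sound only if none of its sub-claims is flagged.
        return all(not sc.is_error for sc in self.sub_claims)


def check_chain(
    steps: List[str],
    decompose: Callable[[str], List[str]],      # step -> list of atomic sub-claims
    verify: Callable[[str, List[str]], float],  # (sub-claim, premises) -> support score
    threshold: float = 0.5,                     # hypothetical decision threshold
) -> List[StepResult]:
    """Decompose each step into sub-claims and flag unsupported ones."""
    results: List[StepResult] = []
    for i, step in enumerate(steps):
        premises = steps[:i]  # earlier steps act as the premises for this step
        step_result = StepResult(step_text=step)
        for sub_claim in decompose(step):
            score = verify(sub_claim, premises)
            step_result.sub_claims.append(
                SubClaimResult(sub_claim, score, is_error=score < threshold)
            )
        results.append(step_result)
    return results
```

The computational concern noted in the excerpt is visible in this sketch: each step incurs one decomposition call plus one verification call per sub-claim, so cost scales with the total number of sub-claims. Mitigations such as caching verifier calls or decomposing only low-confidence steps are plausible but untested assumptions here.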
References
Additionally, our approach assumes that the claims are already decomposed and, therefore, cannot detect errors at the sub-claim level. We leave this for future work, noting it would increase computational costs.
— Probabilistic Soundness Guarantees in LLM Reasoning Chains
(arXiv:2507.12948, You et al., 17 Jul 2025), Limitations section