Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 86 tok/s
Gemini 2.5 Pro 38 tok/s Pro
GPT-5 Medium 23 tok/s
GPT-5 High 29 tok/s Pro
GPT-4o 94 tok/s
GPT OSS 120B 464 tok/s Pro
Kimi K2 166 tok/s Pro
2000 character limit reached

Conformal Language Model Reasoning with Coherent Factuality (2505.17126v1)

Published 21 May 2025 in cs.CL, cs.LG, and stat.ML

Abstract: LLMs are increasingly being used in important decision pipelines, so ensuring the correctness of their outputs is crucial. Recent work has proposed evaluating the "factuality" of claims decomposed from a LLM generation and applying conformal prediction techniques to filter out those claims that are not factual. This can be effective for tasks such as information retrieval, where constituent claims may be evaluated in isolation for factuality, but is not appropriate for reasoning tasks, as steps of a logical argument can be evaluated for correctness only within the context of the claims that precede them. To capture this, we define "coherent factuality" and develop a conformal-prediction-based method to guarantee coherent factuality for LLM outputs. Our approach applies split conformal prediction to subgraphs within a "deducibility" graph" that represents the steps of a reasoning problem. We evaluate our method on mathematical reasoning problems from the MATH and FELM datasets and find that our algorithm consistently produces correct and substantiated orderings of claims, achieving coherent factuality across target coverage levels. Moreover, we achieve 90% factuality on our stricter definition while retaining 80% or more of the original claims, highlighting the utility of our deducibility-graph-guided approach.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Ai Generate Text Spark Streamline Icon: https://streamlinehq.com

Paper Prompts

Sign up for free to create and run prompts on this paper using GPT-5.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube