Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity

Published 27 Apr 2026 in cs.CL, cs.AI, and cs.SC | (2604.23972v1)

Abstract: Knowledge graphs (KGs) are increasingly used to support large language model (LLM) reasoning, but standard triplet-based KGs treat each relation as globally valid. In many settings, whether a relation should count as evidence depends on the context. We therefore formulate triplet validity as a triplet-specific function of context and refer to this formulation as a Quantum Knowledge Graph (QKG). We instantiate QKG in medicine using a diabetes-centered PrimeKG subgraph, whose 68,651 context-sensitive relations are further annotated with patient-group-specific constraints. We evaluate it in a reasoner-validator pipeline for medical question answering on a KG-grounded subset of MedReason containing 2,788 questions. With Haiku-4.5 as both the Reasoner and the Validator, KG-backed validation significantly improves over a no-validator baseline ($+0.61$ pp), and QKG with context matching yields the largest gain, outperforming both KG validation without context matching ($+0.79$ pp) and the no-validator baseline ($+1.40$ pp; paired McNemar, all $p<0.05$). Under a stronger validator (Qwen-3.6-Plus), the raw QKG gain over the no-validator baseline grows from $+1.40$ pp to $+5.96$ pp; the context-matching gap is non-significant ($p=0.73$) on the raw set but becomes borderline significant ($p=0.05$) after adjustment for knowledge leakage and suspicious questions, consistent with a benchmark-gold ceiling rather than a QKG limitation. Taken together, the results support the view that the value of a KG in LLM-based clinical reasoning lies not merely in storing medically related facts, but in representing whether those facts are applicable to the specific patient context. For reproducibility and further research, we release the curated QKG datasets and source code at https://github.com/HKAI-Sci/QKG.


Authors (3)

Summary

The paper introduces the QKG framework, which models triplet validity as a function of context, with a focus on clinical applications.
It implements a reasoner-validator pipeline using LLMs and reports statistically significant gains in medical question answering over both a no-validator baseline and KG validation without context matching.
Experimental results highlight that context-sensitive evaluation boosts accuracy, promising more reliable clinical decision support.

Quantum Knowledge Graphs: Modeling Context-Dependent Triplet Validity

Motivation and Context

Conventional knowledge graphs (KGs), structured as triplets $(h,r,t)$, typically encode fact validity as globally true or false, disregarding contextual subtleties. This paradigm is inadequate for applications where the applicability of a relation is inherently contingent on context, particularly in clinical settings, where patient-specific attributes like comorbidities, lab values, and disease stage modulate the relevance of medical facts. Extant KG extensions introduce qualifiers (e.g., temporal or hyper-relational attributes) and probabilistic validity, but they still fail to operationalize contextual applicability at sufficient granularity.

The Quantum Knowledge Graph (QKG) framework addresses this deficit by framing triplet validity as a function $F_\tau(C)$, parameterized by context $C$, thereby allowing the evaluation of $P(\tau|C)$ at inference time.

Figure 1: Context-dependent triplet validity in a Quantum Knowledge Graph, emphasizing the necessity of contextual triplet evaluation.

Framework Overview

QKG instantiates triplet validity via natural-language applicability constraints attached to relevant relations, particularly those whose clinical utility is context-sensitive: indication, contraindication, off-label use, and drug effect. To scale across heterogeneous and compositional clinical contexts, these constraints are not reduced to fixed structured fields but are preserved as natural-language records, compatible with LLM interpretation.
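The idea of attaching natural-language applicability constraints to a relation can be sketched as a small data structure. This is a minimal illustration, not the released QKG schema: the class names, the `Matcher` signature, the rule that all constraints must match, and the example record are all assumptions, and `substring_matcher` is a toy stand-in for the LLM judgement that would interpret the constraint text in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Triplet:
    head: str
    relation: str
    tail: str


@dataclass
class QKGRelation:
    """A triplet plus free-text applicability constraints (hypothetical schema)."""
    triplet: Triplet
    constraints: List[str] = field(default_factory=list)


# (constraint_text, patient_context_text) -> does the patient satisfy it?
Matcher = Callable[[str, str], bool]


def is_applicable(rel: QKGRelation, patient_context: str, matcher: Matcher) -> bool:
    # Assumption: a relation with no constraints is treated as globally valid,
    # and every attached constraint must match for the relation to count.
    return all(matcher(c, patient_context) for c in rel.constraints)


def substring_matcher(constraint: str, patient_context: str) -> bool:
    # Toy stand-in for an LLM judgement: case-insensitive substring test.
    return constraint.lower() in patient_context.lower()


# Illustrative placeholder record, not taken from the released dataset.
rel = QKGRelation(
    Triplet("drug_X", "contraindication", "disease_Y"),
    constraints=["severe renal impairment"],
)
patient_a = "62-year-old with type 2 diabetes and severe renal impairment"
patient_b = "45-year-old with type 2 diabetes and normal renal function"

print(is_applicable(rel, patient_a, substring_matcher))  # True
print(is_applicable(rel, patient_b, substring_matcher))  # False
```

Keeping the constraint as text and passing the matcher in as a callable mirrors the design choice described above: the stored record stays schema-free, and the interpretation step can be swapped for an LLM call.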

The operational pipeline comprises two agentic roles: a Reasoner (LLM) generates answers and supporting claims; a Validator leverages QKG, querying patient-context-matched applicability conditions to support or contradict claims. If validation contradicts an answer, the Reasoner revises its output.
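The control flow of this loop can be sketched as follows. The sketch assumes both roles are plain callables and that at most one revision round is allowed; the names `Draft`, `Verdict`, `answer_with_validation`, and `max_revisions` are hypothetical, and the stub reasoner and validator exist only to make the example executable.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Draft:
    answer: str
    claims: List[str] = field(default_factory=list)


@dataclass
class Verdict:
    contradicted: bool
    evidence: List[str] = field(default_factory=list)


# Hypothetical interfaces: the reasoner optionally receives validator feedback.
Reasoner = Callable[[str, Optional[Verdict]], Draft]
Validator = Callable[[Draft, str], Verdict]


def answer_with_validation(question: str, patient_context: str,
                           reasoner: Reasoner, validator: Validator,
                           max_revisions: int = 1) -> Draft:
    draft = reasoner(question, None)
    for _ in range(max_revisions):
        verdict = validator(draft, patient_context)
        if not verdict.contradicted:
            break  # claims are supported (or not contradicted); keep the answer
        draft = reasoner(question, verdict)  # revise using the contradicting evidence
    return draft


# Stubs standing in for LLM calls.
def stub_reasoner(question: str, feedback: Optional[Verdict]) -> Draft:
    if feedback is None:
        return Draft("A", ["drug_X is indicated"])
    return Draft("B", ["drug_X is contraindicated for this patient"])


def stub_validator(draft: Draft, patient_context: str) -> Verdict:
    contradicted = draft.answer == "A" and "renal impairment" in patient_context
    return Verdict(contradicted, ["constraint: severe renal impairment"] if contradicted else [])


revised = answer_with_validation("Which drug?", "severe renal impairment",
                                 stub_reasoner, stub_validator)
kept = answer_with_validation("Which drug?", "normal renal function",
                              stub_reasoner, stub_validator)
print(revised.answer, kept.answer)  # B A
```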

Figure 2: Architecture of the QKG framework, contrasting conventional context-insensitive validity with QKG context-conditioned triplet evaluation.

Medical Domain Instantiation

The study curates a diabetes-centric subgraph from PrimeKG, focusing on patient-relevant relations and annotating its 68,651 context-sensitive relations with patient-group-specific constraints generated through an API-served model (Baichuan-M2-Plus). The MedReason dataset serves as the evaluation benchmark, filtered to a KG-grounded subset of 2,788 questions. Patient context is extracted per question, so that applicability constraints can be matched against it during validation.

Experimental Results

The primary evaluation metric is exact-match accuracy on medical question answering. Three settings are compared: (1) no-validator baseline; (2) KG validation without context matching; (3) QKG validation with context matching. In the main configuration, Haiku-4.5 serves as both Reasoner and Validator; a second configuration uses the stronger Qwen-3.6-Plus as Validator.

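QKG validation outperforms both baselines. With Haiku-4.5 as Validator, KG validation without context matching improves on the no-validator baseline by +0.61 pp, QKG validation adds a further +0.79 pp over KG validation without context matching ($p=0.014$), and the total gain of QKG validation over the no-validator baseline is +1.40 pp (paired McNemar, all $p<0.05$). With Qwen-3.6-Plus as Validator, the raw gain of QKG validation over the no-validator baseline grows to +5.96 pp, but knowledge leakage necessitates adjustment: the context-matching gap is non-significant on the raw set ($p=0.73$) and becomes borderline significant ($p=0.05$) after adjustment for knowledge leakage and suspicious questions.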
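To put the percentage-point gains in perspective, the sketch below converts them into net question counts on the 2,788-question subset and implements an exact two-sided McNemar test on discordant pairs. Two assumptions apply: the gains are taken to be computed over all 2,788 questions, and the exact binomial form is used here for illustration, since the source says only "paired McNemar" and does not state which variant was applied. The discordant counts passed to `mcnemar_exact` are toy values, not results from the paper.

```python
from math import comb

N_QUESTIONS = 2788  # size of the KG-grounded MedReason subset


def pp_to_net_questions(gain_pp: float, n: int = N_QUESTIONS) -> int:
    """Net number of additionally correct questions implied by a gain in pp."""
    return round(gain_pp / 100 * n)


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: items correct under system 1 only; c: items correct under system 2 only.
    Under the null, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


for gain in (0.61, 0.79, 1.40, 5.96):
    print(gain, pp_to_net_questions(gain))

# Toy discordant counts, for illustration only.
print(mcnemar_exact(0, 5))  # 0.0625
print(mcnemar_exact(3, 3))  # 1.0
```

Under these assumptions the Haiku-4.5 gains correspond to net changes of roughly 17, 22, and 39 questions, which is consistent with the reported decomposition (0.61 + 0.79 = 1.40 pp). Because McNemar's test depends on the discordant counts and not on the net difference alone, the p-values cannot be recovered from the pp gains by themselves.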

Figure 3: Haiku-validator results and patient-context ablation, comparing final accuracy and revision statistics for the three evaluation settings.

Case studies reveal two modes of context-dependent correction: one combines multifactorial patient attributes to amplify risk; the other matches patient lab values to eligibility thresholds.

Figure 4: Compositional risk-amplifier and threshold-based contraindication cases, illustrating multifactorial and threshold-driven patient-context corrections in QKG validation.

Use of stronger validators exposes knowledge leakage effects—the validator may revise answers based on internal model knowledge absent explicit KG evidence. These effects are quantitatively controlled via leakage classification and adjusted accuracy reporting.
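One plausible form of such an adjustment is to recompute accuracy after excluding flagged questions. The sketch below is an assumption about how this could be done, not the paper's procedure, which is not detailed here: the `leaked` and `suspicious` flags, the exclusion rule, and the toy outcomes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Outcome:
    correct: bool
    leaked: bool = False      # hypothetical: revision relied on the validator's own knowledge
    suspicious: bool = False  # hypothetical: question or gold label judged unreliable


def accuracy(outcomes: Iterable[Outcome], adjusted: bool = False) -> float:
    # Assumption: "adjusted" means dropping flagged items before scoring.
    kept = [o for o in outcomes if not (adjusted and (o.leaked or o.suspicious))]
    if not kept:
        raise ValueError("no outcomes left to score")
    return sum(o.correct for o in kept) / len(kept)


# Toy outcomes, for illustration only.
toy = [Outcome(True), Outcome(True, leaked=True), Outcome(True, leaked=True), Outcome(False)]
print(accuracy(toy))                 # 0.75
print(accuracy(toy, adjusted=True))  # 0.5
```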

Figure 5: Qwen-validator results and ablation, showing the impact of context-conditioned QKG validation versus conventional KG and no-validator settings.

Implications and Future Directions

The results underscore that the value of KGs in LLM-driven reasoning transcends raw factual retrieval. It lies in encoding and operationalizing the conditions under which facts are valid, thereby supporting contextual claim verification—a crucial requirement for domains like medicine.

Practically, QKG enables agentic systems to filter evidence on patient characteristics, improving trustworthiness in clinical decision support. Theoretically, this motivates formalizations of KG validity as context-conditioned functions, inviting further exploration in representation learning, context-aware retrieval, and benchmark construction.

Future research should focus on real-world clinical reasoning benchmarks, where patient-context combinatorics are not fully captured by existing MCQ formats. Routine clinical data, annotated for reasoning traces and context applicability, will be critical to isolate QKG's contribution in operational environments. Model-driven knowledge leakage remains a concern in strong validators; improved architectures for disentangling KG evidence from parametric model knowledge are warranted.

Conclusion

Quantum Knowledge Graphs formalize triplet validity as a context-dependent quantity, operationalized via natural-language applicability conditions and evaluated by a reasoner-validator pipeline. Empirical results on KG-grounded medical QA show statistically significant gains from patient-context matching with the Haiku-4.5 validator and a borderline-significant gap with Qwen-3.6-Plus after leakage adjustment, supporting the view that context-sensitive triplet evaluation provides value beyond conventional fact retrieval. The approach advances KG-LLM integration toward more robust, context-aware reasoning in clinical and other domains, while highlighting the need for new benchmarks to assess real-world applicability.
