MedC-K: Knowledge Injection in Medical NLP

Updated 27 November 2025

MedC-K is a framework that injects structured clinical knowledge into neural models for radiology report generation, addressing factual accuracy and domain specificity.
It employs a dual-branch architecture with Weighted Concept Knowledge (WCK) and Multimodal Retrieval Knowledge (MRK) to integrate clinical terminology and structured relation triplets.
Fusion via cross-attention mechanisms combines image and textual features, leading to marked improvements in metrics like BLEU, ROUGE-L, and CIDEr.

Knowledge injection encompasses a family of techniques that explicitly introduce structured or unstructured external knowledge into neural generative models and LLMs, with the aim of improving factual accuracy, mitigating knowledge incompleteness, and tailoring outputs to knowledge-intensive biomedical domains. MedC-K, as formalized in “Enhanced Knowledge Injection for Radiology Report Generation” (Li et al., 2023), is representative of a larger trend in medical NLP toward integrating multiple curated knowledge streams (domain-specific concepts, retrieval-augmented facts, and structured symbolic information) directly into neural architectures, typically via attention-based or fusion mechanisms.

1. Fundamental Concepts and Motivation

Neural models for language generation and understanding in the biomedical domain frequently suffer from limited factuality and incompleteness, owing to the long-tail nature of medical terminology and the inherent gap between statistical training data and expert-curated domain knowledge. Knowledge injection addresses these limitations by integrating external knowledge into the model’s input, intermediate representations, or even parameters. The MedC-K framework leverages this paradigm specifically for radiology report generation, targeting challenges such as:

Semantic gap between image features and clinical language.
Lack of domain specificity, especially for rare or subtle clinical findings.
Limited reasoning over structured relationships (e.g., entity–position–existence), which are often absent in non-injected generative models.

2. MedC-K Architecture: Dual Knowledge Branches

MedC-K implements a two-branch knowledge-injection architecture:

2.1 Weighted Concept Knowledge (WCK):

Maintains a set of $N_c=76$ high-priority clinical concepts.
Computes TF-IDF weighting of concepts per report (training) or retrieved report set (test):

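$\mathrm{TFIDF}(c, d) = \frac{n_{c,d}}{\sum_k n_{k,d}} \times \log \frac{|R|}{1 + |\{j: w_c \in r_j\}|}$

where $n_{c,d}$ is the number of occurrences of concept $c$ in document $d$, the sum runs over the terms $k$ of $d$, $|R|$ is the number of reports in the collection, and $|\{j: w_c \in r_j\}|$ counts the reports $r_j$ that contain the concept word $w_c$.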

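Embeds each concept using ClinicalBERT, producing $F_c \in \mathbb{R}^{B \times N_c \times d}$, where $B$ is the batch size and $d$ the embedding dimension.
Applies soft selection on concept embeddings: $K_c = F_c \odot S_c$, where $S_c$ holds the per-concept weights and $\odot$ denotes element-wise multiplication.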
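
The weighting and soft-selection steps above can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation. It assumes single-word concepts matched by exact token equality, takes the TF denominator over all tokens of the report, and uses toy vectors in place of ClinicalBERT embeddings. The function names `tfidf_weights` and `soft_select` are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(concepts, report_tokens, corpus):
    """Return one TF-IDF weight per concept for a single report d.

    concepts:      list of single-word concepts (toy vocabulary, an assumption)
    report_tokens: tokens of the report d
    corpus:        list of token lists, one per report in the collection R
    """
    counts = Counter(report_tokens)
    total = len(report_tokens)  # sum_k n_{k,d}, assumed to span all tokens of d
    weights = []
    for c in concepts:
        tf = counts[c] / total if total else 0.0
        df = sum(1 for r in corpus if c in r)  # reports containing the concept
        idf = math.log(len(corpus) / (1 + df))
        weights.append(tf * idf)
    return weights


def soft_select(embeddings, weights):
    """K_c = F_c * S_c for one sample: scale each concept vector by its weight."""
    return [[w * x for x in emb] for emb, w in zip(embeddings, weights)]


# Toy usage: four tokenized reports and three concepts.
corpus = [
    ["pleural", "effusion", "noted"],
    ["no", "pneumothorax"],
    ["cardiomegaly", "with", "effusion"],
    ["clear", "lungs"],
]
concepts = ["effusion", "pneumothorax", "cardiomegaly"]
weights = tfidf_weights(concepts, corpus[2], corpus)
toy_embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # stand-ins for ClinicalBERT
k_c = soft_select(toy_embeddings, weights)
```

With the $1 +$ term in the denominator, the IDF factor becomes zero or negative for a concept that appears in all or nearly all reports, so very common concepts are suppressed rather than merely down-weighted.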

2.2 Multimodal Retrieval Knowledge (MRK):

Employs a frozen MGCA model for image-wise retrieval of the top-$k$ ($k=3$) semantically similar images.
Extracts reports from these images and mines relation triplets $\{\textrm{entity},\,\textrm{position},\,\textrm{existence}\}$ via RadGraph and AGXNet.
Verbalizes triplets to text prompts (e.g., “No [entity]”, “[Position] is [entity]”).
Encodes these prompts with ClinicalBERT to obtain $K_t \in \mathbb{R}^{B \times N_T \times d}$, where $N_T$ is the number of verbalized prompts.
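
The verbalization step can be illustrated with a short sketch. It uses only the two templates quoted above. The `Triplet` type, the function names, and the de-duplication of repeated prompts are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, NamedTuple


class Triplet(NamedTuple):
    """One mined relation: entity, anatomical position, and existence flag."""
    entity: str
    position: str
    exists: bool


def verbalize(t: Triplet) -> str:
    """Turn a triplet into a prompt using the two templates from the text."""
    if not t.exists:
        return f"No {t.entity}"
    return f"{t.position.capitalize()} is {t.entity}"


def build_prompts(triplets: List[Triplet]) -> List[str]:
    """Verbalize all triplets, dropping repeats while keeping first-seen order.

    De-duplication is an assumption: retrieved reports often repeat findings.
    """
    seen = set()
    prompts = []
    for t in triplets:
        p = verbalize(t)
        if p not in seen:
            seen.add(p)
            prompts.append(p)
    return prompts


# Toy usage with hypothetical triplets from three retrieved reports.
mined = [
    Triplet("pneumothorax", "", False),
    Triplet("opacity", "left lower lobe", True),
    Triplet("pneumothorax", "", False),
]
prompts = build_prompts(mined)
```

The resulting prompt strings are what a text encoder such as ClinicalBERT would then embed to form $K_t$.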

The dual branches capture both the salient terminology weighted for relevance (WCK) and structured, position-annotated clinical findings (MRK).
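
The retrieval step that feeds the MRK branch can be sketched as a nearest-neighbor search over precomputed image embeddings. The sketch assumes cosine similarity as the ranking score and small toy vectors in place of embeddings from the frozen retrieval encoder. The source specifies only that the top $k=3$ semantically similar images are retrieved, so the similarity measure and the function names are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(query, bank, k=3):
    """Return indices of the k bank embeddings most similar to the query."""
    ranked = sorted(range(len(bank)), key=lambda i: cosine(query, bank[i]), reverse=True)
    return ranked[:k]


# Toy usage: a 2-D query embedding against a bank of five image embeddings.
bank = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [-1.0, 0.0], [2.0, 0.1]]
neighbors = retrieve_top_k([1.0, 0.0], bank, k=3)
```

The reports attached to the returned indices are the ones from which triplets would then be mined.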

3. Fusion Mechanisms and Representation Learning

The fused feature representation $F'_I$ combines image and textual knowledge streams as follows:

$F'_I = F_I + \mathrm{Att}(F_I, K_c, K_c) + \mathrm{Att}(F_I, K_t, K_t)$

where $F_I$ is the ResNet101 image feature map, and each $\mathrm{Att}(Q, K, V)$ is a cross-attention operation in which the image features act as queries and the knowledge embeddings ($K_c$ or $K_t$) act as both keys and values. This mixture-of-knowledge (MoK) fusion aggregates features from multiple sources before report generation, which is performed by a Transformer decoder attending over $F'_I$. Only a single cross-entropy (language modeling) loss is used.
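
The fusion equation can be made concrete with a minimal single-sample sketch. It assumes single-head scaled dot-product attention with no learned projection matrices, which is a simplification. The paper's attention layers are presumably standard multi-head Transformer blocks with learned weights. All function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: one output row per query row."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        attn = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(attn, values))
            for j in range(len(values[0]))
        ])
    return out


def fuse(f_i, k_c, k_t):
    """F'_I = F_I + Att(F_I, K_c, K_c) + Att(F_I, K_t, K_t) for one sample."""
    a_c = cross_attention(f_i, k_c, k_c)
    a_t = cross_attention(f_i, k_t, k_t)
    return [
        [x + y + z for x, y, z in zip(row, row_c, row_t)]
        for row, row_c, row_t in zip(f_i, a_c, a_t)
    ]


# Toy usage: two image patches, three concept vectors, two prompt vectors (d = 2).
f_i = [[1.0, 0.0], [0.0, 1.0]]
k_c = [[2.0, 2.0], [0.5, 0.0], [0.0, 0.5]]
k_t = [[1.0, -1.0], [0.0, 1.0]]
fused = fuse(f_i, k_c, k_t)
```

Because both attention terms are added residually, the fused map keeps the shape of $F_I$, so the decoder can consume it exactly as it would consume the raw image features.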

4. Experimental Design and Evaluation

MedC-K was evaluated on two public chest X-ray datasets: IU-Xray and MIMIC-CXR. Core details:

Visual backbone: ResNet101 with frontal and/or lateral views.
Concept encoder: ClinicalBERT, $d=768$.
Retrieval: MGCA-pretrained, $k=3$ neighbors.
Transformer generator: 3 encoder, 3 decoder layers.
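
For reference, the settings listed above can be collected into a single configuration object. The class and field names are hypothetical and chosen only for illustration. The values are the ones stated in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MedCKConfig:
    """Hyperparameters as reported in the text; field names are illustrative."""
    visual_backbone: str = "ResNet101"
    concept_encoder: str = "ClinicalBERT"
    hidden_dim: int = 768        # d
    num_concepts: int = 76       # N_c
    retrieval_model: str = "MGCA"
    retrieval_top_k: int = 3     # k
    encoder_layers: int = 3
    decoder_layers: int = 3


config = MedCKConfig()
```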

Metrics include BLEU-1 through BLEU-4, METEOR, ROUGE-L, and CIDEr. On both datasets, MedC-K surpasses the strongest prior methods by up to +1.9 BLEU-1 and +0.021 ROUGE-L (IU-Xray), and it improves on or matches the state of the art on almost all metrics. The ablation study reports the contribution of each knowledge variant:

Knowledge Variant	BLEU-1	BLEU-4	ROUGE-L	METEOR	CIDEr
Base Transformer	0.466	0.168	0.360	0.188	0.427
+ plain concepts	0.477	0.174	0.376	0.200	0.527
+ weighted (WCK)	0.492	0.189	0.380	0.207	0.629
+ triplets (MRK)	0.500	0.171	0.379	0.211	0.524
WCK + MRK (full MedC-K)	0.516	0.207	0.400	0.222	0.608

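The largest single gain comes from weighted concept injection (CIDEr +0.202 over the base Transformer). Adding triplet-based MRK yields further improvements on BLEU-1, BLEU-4, ROUGE-L, and METEOR, which are sensitive to phrase precision, although the full model's CIDEr (0.608) is slightly below that of WCK alone (0.629).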
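
The gains quoted above follow directly from the ablation table. The short sketch below recomputes each variant's difference from the base Transformer using the table values. The dictionary layout and the function name are illustrative.

```python
ABLATION = {
    "Base Transformer": {"BLEU-1": 0.466, "BLEU-4": 0.168, "ROUGE-L": 0.360, "METEOR": 0.188, "CIDEr": 0.427},
    "+ plain concepts": {"BLEU-1": 0.477, "BLEU-4": 0.174, "ROUGE-L": 0.376, "METEOR": 0.200, "CIDEr": 0.527},
    "+ weighted (WCK)": {"BLEU-1": 0.492, "BLEU-4": 0.189, "ROUGE-L": 0.380, "METEOR": 0.207, "CIDEr": 0.629},
    "+ triplets (MRK)": {"BLEU-1": 0.500, "BLEU-4": 0.171, "ROUGE-L": 0.379, "METEOR": 0.211, "CIDEr": 0.524},
    "WCK + MRK (full MedC-K)": {"BLEU-1": 0.516, "BLEU-4": 0.207, "ROUGE-L": 0.400, "METEOR": 0.222, "CIDEr": 0.608},
}


def gains_over_base(table, base="Base Transformer"):
    """Per-metric difference between each variant and the base row, to 3 decimals."""
    ref = table[base]
    return {
        name: {metric: round(value - ref[metric], 3) for metric, value in row.items()}
        for name, row in table.items()
        if name != base
    }


gains = gains_over_base(ABLATION)
for name, row in gains.items():
    print(name, row)
```

Read this way, weighting the concepts accounts for most of the CIDEr improvement, while the full model has the largest gains on the remaining four metrics.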

5. Relation to Other Knowledge-Injection Approaches

MedC-K sits within a spectrum of medical knowledge-injection approaches:

Kformer (Yao et al., 2022): Expands FFN “key–value memory” with projected embeddings of retrieved knowledge, focusing on textual reasoning (e.g., SocialIQA, MedQA-USMLE).
SA-MDKIF (Xu et al., 2024): Modularizes medical “skills” as AdaLoRA-tuned adapters and fuses them via a skill router for LLM adaptability across tasks (e.g., entity extraction, summarization).
GKI-ICD (Zhang et al., 2025): Aggregates code descriptions, synonyms, and hierarchy into synthetic “guideline” inputs, leveraging multitask learning for ICD coding without architectural modification.
DKINet (Liu et al., 2023): Injects subgraph-based UMLS knowledge into EHR-based patient representations using filter-based aggregation and fusion attention.

MedC-K is distinct in its modality (vision–language), the dual-branch integration of both weighted clinical concepts and position-structured relation triplets, and its empirical focus on radiology report generation.

6. Limitations and Future Directions

Principal limitations identified by Li et al. (2023) include:

Retrieval noise: MGCA-based retrieval may inject semantically similar but clinically uninformative reports.
Concept/triplet scope: Triplets capture only entity presence and position, omitting severity or temporal progression.
Generalization challenge: TF-IDF weighting may underrepresent rare entities that surface only at test time.

Suggested future extensions:

Learnable gating or attention mechanisms for concept/triplet filtering.
Integration with structured medical knowledge graphs (e.g., UMLS, RadLex) for expanded relations and granularity.
Contrastive alignment between knowledge and image representations.
N-tuple extension for richer entity-attribute modeling.

7. Impact and Outlook

By combining TF-IDF-weighted concept injection with structured, retrieval-based triplets and fusing both through cross-attention, MedC-K reports improvements in lexical accuracy and clinical factuality in radiology report generation (Li et al., 2023). Its dual-branch approach exemplifies a general design pattern for medical NLP: simultaneous aggregation of salient terminology and structured relational knowledge. Related knowledge-injection frameworks report gains beyond radiology, in medical QA, ICD coding, medication recommendation, and note generation (Yao et al., 2022; Liu et al., 2023; Xu et al., 2024; Zhang et al., 2025). A continued trend is anticipated toward multi-source, modular, and dynamically weighted knowledge injection, further closing accuracy gaps in biomedical language understanding and generation.