MedDRA Preferred Terms (PTs) Overview

Updated 15 December 2025

MedDRA Preferred Terms (PTs) are standardized clinical concepts that serve as the primary descriptors for adverse event coding across regulatory and safety reporting systems.
They enable precise mapping and grouping of adverse events through automated and manual methods, leveraging techniques like fuzzy matching and hierarchical ontology traversal.
Advanced computational frameworks use semantic similarity measures and Bayesian models to enhance signal detection and reproducibility in pharmacovigilance workflows.

MedDRA Preferred Terms (PTs) are standardized descriptors used to encode clinical concepts in drug safety, regulatory reporting, and pharmacovigilance. MedDRA, the Medical Dictionary for Regulatory Activities, is structured as a five-level hierarchical ontology: System Organ Class (SOC), High Level Group Terms (HLGT), High Level Terms (HLT), Preferred Terms (PT), and Lowest Level Terms (LLT). PTs serve as atomic clinical concepts—each PT (e.g., “Nausea,” “Peripheral oedema”) subsumes one or more LLTs (synonyms, lexical variants). PTs are the primary entry point for both manual and automated coding of adverse events (AEs) in structured product labels, spontaneous reporting systems, clinical trials, and safety analytics. In advanced pharmacovigilance workflows, PTs can be computationally mapped, grouped, and analyzed using semantic similarity measures (SSMs), hierarchical Bayesian models, and high-dimensional embeddings, yielding enhanced sensitivity in signal detection, reduced manual review, and reproducible term selection.

1. Structure and Representation of MedDRA PTs

All MedDRA PTs are catalogued in the ontology as unambiguous clinical concepts. As of MedDRA v26.1, the PT inventory encompasses 26,409 terms (Painter et al., 26 Mar 2025). Each PT is identified by a Concept Unique Identifier (CUI) in the Unified Medical Language System (UMLS) Metathesaurus, which enables interlinking with external ontologies such as SNOMED-CT and MeSH. This networked representation serves as the basis for relationship traversal and semantic decomposition, e.g., mapping “gastric ulcer” to its semantic atoms (“ulcer” + “stomach”). PTs aggregate multiple LLTs; for example, LLT “itching” and LLT “pruritus” both map to PT “Pruritus” (Painter et al., 26 Mar 2025). PVLens and similar systems automate the mapping of text spans extracted from structured product labels (SPLs) to PTs using dictionary lookups, UMLS queries, and fuzzy string matching, achieving a recall of 98.3% and an F1 of 88.2% at the PT level (Painter et al., 26 Mar 2025).
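The LLT-to-PT aggregation described above can be pictured as a simple dictionary lookup. The following is a minimal sketch, not the PVLens implementation: only the two pruritus entries come from the text, while the table layout and the names `LLT_TO_PT`, `normalize`, and `lookup_pt` are hypothetical.

```python
from typing import Optional

# Illustrative LLT -> PT table. Only these two entries are taken from the
# text; a real MedDRA release contains many more LLTs per PT.
LLT_TO_PT = {
    "itching": "Pruritus",
    "pruritus": "Pruritus",
}

def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so lexical variants compare equal."""
    return " ".join(term.lower().split())

def lookup_pt(verbatim: str) -> Optional[str]:
    """Return the PT for a verbatim term, or None if no LLT matches exactly."""
    return LLT_TO_PT.get(normalize(verbatim))

print(lookup_pt("  Itching "))  # Pruritus
print(lookup_pt("PRURITUS"))    # Pruritus
print(lookup_pt("rash"))        # None (not in this toy table)
```

An exact lookup like this would typically be the first stage; terms that return `None` are the ones that fall through to UMLS queries or fuzzy matching.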

2. Semantic Similarity Measures for Clustering PTs

MedDRA PTs can be clustered and quantitatively grouped using ontology-based SSMs (Painter et al., 26 Mar 2025, Haguinet et al., 16 Apr 2025). Formal definitions include:

Intrinsic Information Content (IC): For concept $c$,

$\mathrm{IC}(c) = -\log\frac{|\mathrm{desc}(c)| + 1}{|\mathrm{AllConcepts}|}$

Path-Based Measures:
- Wu & Palmer (WUPALMER): compares the depth of the lowest common ancestor (LCA) with the depths of the two concepts.
- Leacock & Chodorow (LCH): computes shortest-path length in the ontology.

Intrinsic IC-Based Measures:
- Resnik ($\mathrm{sim}_\mathrm{res}$): $\mathrm{IC}(\mathrm{LCA}(c_1, c_2))$
- Lin ($\mathrm{sim}_\mathrm{lin}$): $\frac{2\,\mathrm{IC}(\mathrm{LCA}(c_1, c_2))}{\mathrm{IC}(c_1)+\mathrm{IC}(c_2)}$
- Sokal ($\mathrm{sim}_\mathrm{sok}$): $\frac{\mathrm{IC}(\mathrm{LCA}(c_1, c_2))}{2[\mathrm{IC}(c_1)+\mathrm{IC}(c_2)]-3\,\mathrm{IC}(\mathrm{LCA}(c_1, c_2))}$
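The intrinsic IC and the three IC-based measures can be evaluated directly from their definitions. The sketch below does so on a six-concept toy hierarchy that is purely illustrative and is not real MedDRA content; the function names and the convention of returning 1.0 when a denominator is zero are assumptions.

```python
import math

# Toy is-a hierarchy (child -> parent). Illustrative only, NOT MedDRA.
PARENT = {
    "gi_disorder": "root",
    "skin_disorder": "root",
    "nausea": "gi_disorder",
    "vomiting": "gi_disorder",
    "pruritus": "skin_disorder",
}
CONCEPTS = {"root", *PARENT}

def ancestors(c):
    """Path from c up to the root, including c itself."""
    path = [c]
    while c in PARENT:
        c = PARENT[c]
        path.append(c)
    return path

def descendants(c):
    """All concepts strictly below c."""
    return {x for x in CONCEPTS if x != c and c in ancestors(x)}

def ic(c):
    """Intrinsic IC: -log((|desc(c)| + 1) / |AllConcepts|), written as log(N / (d + 1))."""
    return math.log(len(CONCEPTS) / (len(descendants(c)) + 1))

def lca(c1, c2):
    """Lowest common ancestor: first ancestor of c1 that is also an ancestor of c2."""
    anc2 = set(ancestors(c2))
    return next(a for a in ancestors(c1) if a in anc2)

def sim_resnik(c1, c2):
    return ic(lca(c1, c2))

def sim_lin(c1, c2):
    denom = ic(c1) + ic(c2)
    return 2 * ic(lca(c1, c2)) / denom if denom else 1.0  # assumed convention

def sim_sokal(c1, c2):
    shared = ic(lca(c1, c2))
    denom = 2 * (ic(c1) + ic(c2)) - 3 * shared
    return shared / denom if denom else 1.0  # assumed convention

for pair in [("nausea", "vomiting"), ("nausea", "pruritus")]:
    print(pair, round(sim_resnik(*pair), 3), round(sim_lin(*pair), 3),
          round(sim_sokal(*pair), 3))
```

In this toy tree the siblings "nausea" and "vomiting" share an informative ancestor and score above zero on all three measures, whereas "nausea" and "pruritus" meet only at the root, whose IC is zero.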

Empirically, intrinsic IC-based SSMs—INTRINSIC_LIN, INTRINSIC_LCH, and SOKAL—achieve higher F1 for cluster prediction (0.403–0.404) relative to path-based methods (0.28–0.36) and align better with expert review (Cohen’s κ of 0.60–0.75) (Painter et al., 26 Mar 2025). These SSMs are foundational for data-driven groupings, automated query generation, and borrowing in Bayesian analyses (Haguinet et al., 16 Apr 2025, Vandenhende et al., 8 Dec 2025).

3. Computational Frameworks for PT Quantification and Clustering

High-throughput analysis of PTs relies on RESTful APIs, Java backends (based on Apache cTAKES YTEX), and Python/R clients for pairwise SSM computation at scale (Painter et al., 26 Mar 2025). For automated PT selection or grouping (e.g., PRO-CTCAE item reduction (Vandenhende et al., 7 Dec 2025)), PTs are embedded in a high-dimensional vector space (the “SafeTerm” representation), enabling spectral analysis, orthogonal subspace selection, and diversity-leverage ranking. Principal component analysis (PCA) or t-SNE reduces the dimensionality for visualization (Vandenhende et al., 8 Dec 2025).

Clustering workflows use agglomerative thresholding: for a given cluster centroid (often a PT that labels a Standardised MedDRA Query), PTs above a similarity threshold form the predicted set. Evaluation employs metrics such as ROC curves, precision, recall, F1, and Cohen’s κ, referencing expert medical review and MedDRA definitions (Painter et al., 26 Mar 2025).
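The thresholding and evaluation steps above can be sketched as follows. The similarity scores and the reference set are invented for illustration and do not correspond to any real SMQ; the function names are hypothetical, and treating a score equal to the threshold as a member is an assumption.

```python
def predict_cluster(sim_to_centroid, threshold):
    """PTs whose similarity to the centroid PT meets the threshold."""
    return {pt for pt, s in sim_to_centroid.items() if s >= threshold}

def precision_recall_f1(predicted, reference):
    """Set-based precision, recall, and F1 against a reference cluster."""
    tp = len(predicted & reference)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical similarities of candidate PTs to a centroid PT "Nausea".
sims = {"Vomiting": 0.82, "Dizziness": 0.71, "Retching": 0.64, "Pruritus": 0.22}
reference = {"Vomiting", "Retching"}  # illustrative expert reference set

predicted = predict_cluster(sims, threshold=0.70)
print(sorted(predicted))                          # ['Dizziness', 'Vomiting']
print(precision_recall_f1(predicted, reference))  # (0.5, 0.5, 0.5)
```

Sweeping `threshold` over a grid and recording the resulting precision and recall is one way to obtain the ROC-style curves mentioned above.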

Table: SSMs and Evaluated Cluster Performance (Painter et al., 26 Mar 2025)

SSM Variant	F1 (Expert Ref)	Cohen’s κ (Expert Ref)
WUPALMER	0.360	0.45–0.55
LCH	0.280	0.45–0.55
INTRINSIC_LIN	0.403	0.60–0.75
INTRINSIC_LCH	0.404	0.60–0.75
SOKAL	0.403	0.60–0.75

4. Automated Query Generation and PT Selection

Custom MedDRA queries and PT subsets are generated using AI-driven methods (SafeTerm (Vandenhende et al., 8 Dec 2025)). Workflows include:

- Embedding: PTs and raw queries are mapped into a shared vector space (D ≃ 768).
- Fuzzy matching: captures exact and near-exact PT names (Levenshtein similarity ≥ 0.90).
- Dual-cosine scoring: $(s_1, s_2)$ for query/PT and PT/best PT.
- Extreme-value clustering: k-means ($k=2$) isolates highly relevant PTs.
- Thresholding: sets are formed based on $s_2$; $\theta=0.60$ for broad recall, $\theta=0.75$ for higher precision.
- Manual refinement: ranking and “knee point” selection.
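The scoring, clustering, and thresholding steps can be sketched in a few lines. This is not the SafeTerm implementation: the 3-dimensional vectors stand in for real embeddings, and the function names, the reading of "best PT" as the PT with the highest $s_1$, and the choice to run the 2-means split on $s_1$ are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def two_means_high_cluster(values, max_iter=50):
    """1-D k-means with k=2; True marks members of the higher-mean cluster."""
    lo, hi = min(values), max(values)
    assign = [True] * len(values)
    for _ in range(max_iter):
        assign = [abs(v - hi) <= abs(v - lo) for v in values]
        high = [v for v, a in zip(values, assign) if a]
        low = [v for v, a in zip(values, assign) if not a]
        new_hi = sum(high) / len(high) if high else hi
        new_lo = sum(low) / len(low) if low else lo
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return assign

def select_pts(query_vec, pt_vecs, theta=0.60):
    names = list(pt_vecs)
    s1 = {n: cosine(query_vec, pt_vecs[n]) for n in names}      # query vs PT
    best = max(names, key=s1.get)                               # assumed "best PT"
    s2 = {n: cosine(pt_vecs[n], pt_vecs[best]) for n in names}  # PT vs best PT
    flags = two_means_high_cluster([s1[n] for n in names])
    relevant = {n for n, f in zip(names, flags) if f}           # k=2 split on s1
    selected = {n for n in names if s2[n] >= theta}             # threshold on s2
    return best, relevant, selected

# Toy 3-D "embeddings"; real vectors would have D around 768.
query = [1.0, 0.0, 0.0]
pts = {
    "Nausea": [0.9, 0.1, 0.0],
    "Vomiting": [0.8, 0.3, 0.1],
    "Pruritus": [0.0, 1.0, 0.2],
    "Headache": [0.1, 0.2, 0.9],
}
print(select_pts(query, pts, theta=0.60))
print(select_pts(query, pts, theta=0.75))
```

The sketch reports the cluster-based set and the threshold-based set separately, since how the two are combined is not specified here.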

Reported performance reaches recall above 95% at moderate thresholds, precision up to 86% at high thresholds, and a mean F1 score of 0.39 against FDA-collated gold standards (Vandenhende et al., 8 Dec 2025). This suggests reproducible, version-agnostic PT selection pipelines applicable in regulatory and industrial safety signal detection.

5. PTs in Signal Detection and Bayesian Borrowing

In disproportionality analysis (DPA) for pharmacovigilance, PTs anchor the contingency tables from which drug–event pair statistics are computed. A recent Bayesian dynamic borrowing (BDB) approach enables continuous, similarity-informed information sharing among clinically related PTs (Haguinet et al., 16 Apr 2025). For a target PT, similar PTs are identified using SSMs (typically the Sokal measure) and inform dynamic MAP priors:

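MAP prior mean and variance, where $s_{Ij}$ is the semantic similarity between the target PT $I$ and similar PT $j$, $S$ is the number of similar PTs, and $IC_j$ and $VIC_j$ are read here as the information component estimate for PT $j$ and its variance (not the information content of Section 2):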

$\widehat{\mu}_m = \frac{\sum_{j=1}^S \left(s_{Ij} / VIC_j\right)\;IC_j}{\sum_{j=1}^S s_{Ij}/VIC_j}$

$\widehat{V}_m = \frac{\sum_{j=1}^S (s_{Ij})^2 / VIC_j}{\left(\sum_{j=1}^S s_{Ij}/VIC_j \right)^2}$

Dynamic borrowing factor:

$w_I = \max_j s_{Ij}$
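The three formulas above translate directly into code. The sketch below is a literal transcription under the assumption that each similar PT $j$ contributes a triple $(s_{Ij}, IC_j, VIC_j)$; the input values are made up, and the function name `map_prior` is hypothetical.

```python
def map_prior(similar):
    """Compute (mu_m, V_m, w_I) from a list of (s_Ij, IC_j, VIC_j) triples."""
    weights = [s / vic for s, _, vic in similar]  # s_Ij / VIC_j
    total = sum(weights)
    mu = sum(w * ic for w, (_, ic, _) in zip(weights, similar)) / total
    var = sum(s * s / vic for s, _, vic in similar) / total ** 2
    w_i = max(s for s, _, _ in similar)           # dynamic borrowing factor
    return mu, var, w_i

# Two hypothetical similar PTs: (similarity, IC estimate, IC variance).
similar_pts = [(0.8, 1.2, 0.04), (0.5, 0.6, 0.10)]
mu, var, w_i = map_prior(similar_pts)
print(round(mu, 4), round(var, 4), w_i)  # approximately 1.08, 0.0296, 0.8
```

In this example the first PT dominates the prior mean because it is both more similar to the target and more precisely estimated, which is the behaviour the similarity-over-variance weighting is meant to produce.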

In FAERS, the SSM-informed IC reaches a sensitivity of 57.0% (Youden’s J = 0.246) versus 50.1% (J = 0.250) for the standard IC, and detects signals about 1.75 quarters earlier. Although F1 lags slightly behind the traditional IC, SSM-based borrowing raises true positive rates, especially during early post-marketing periods (Haguinet et al., 16 Apr 2025).
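Because Youden's J is defined as sensitivity plus specificity minus one, the reported figures imply a specificity for each method. The short calculation below makes that trade-off explicit; the specificities it prints are derived by arithmetic from the numbers above and are not values reported in the cited study.

```python
def youden_j(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

def implied_specificity(sensitivity, j):
    """Invert the definition of J to recover specificity."""
    return j - sensitivity + 1.0

reported = [
    ("IC with SSM borrowing", 0.570, 0.246),
    ("standard IC", 0.501, 0.250),
]
for label, sens, j in reported:
    print(label, round(implied_specificity(sens, j), 3))
# IC with SSM borrowing 0.676
# standard IC 0.749
```

Read this way, borrowing gains roughly seven points of sensitivity at the cost of a similar amount of specificity, which is consistent with the nearly unchanged J.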

6. PT Mapping, Annotation, and Manual Coding

Automated and manual PT mapping protocols traverse LLT→PT dictionaries, UMLS concept lookups, string/n-gram matching, and fuzzy similarity scoring (e.g., Levenshtein, pair distance) (Painter et al., 26 Mar 2025, Combi et al., 2015). PVLens employs a hybrid candidate ranking algorithm with confidence thresholds (θ=0.90, δ=0.10 margin), routing ambiguous spans to expert review for adjudication and feedback-driven dictionary expansion. MagiCoder applies voting across ADR tokens, scoring LLT candidates with weighted multi-criteria (coverage, density, distribution), with selection logic that prefers longer, more specific matches and exact prefix coverage (Combi et al., 2015). These systems typically reach ≥60% exact PT agreement with human coding; ongoing enhancements target negation detection, synonym expansion, and orthography robustness.
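A threshold-plus-margin routing rule of the kind described for PVLens can be sketched as below. This is not the PVLens algorithm: normalized Levenshtein similarity stands in for its hybrid score, δ is interpreted as the required gap between the top two candidates, and the function names and candidate list are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance using insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def similarity(a, b):
    """Case-insensitive Levenshtein similarity in [0, 1]."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

def route(span, candidates, theta=0.90, delta=0.10):
    """Auto-code when the top score clears theta and leads by delta; else review.

    Assumes `candidates` is non-empty.
    """
    ranked = sorted(((similarity(span, c), c) for c in candidates), reverse=True)
    top_score, top = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    if top_score >= theta and top_score - runner_up >= delta:
        return ("auto", top)
    return ("review", top)

pt_names = ["Peripheral oedema", "Nausea", "Pruritus"]
print(route("peripheral edema", pt_names))  # ('auto', 'Peripheral oedema')
print(route("nausia", pt_names))            # ('review', 'Nausea')
```

The second span is one edit away from "Nausea", but on a six-character string that yields a similarity of about 0.83, below θ, so it is routed to expert review rather than coded automatically.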

7. Implications, Limitations, and Ongoing Adaptations

Semantic clustering of PTs enables mechanism-level grouping of adverse events beyond hierarchical constraints, supporting early signal detection and workload reduction—IC-based SSMs decrease manual reviewer burden by over 40% while retaining ≥80% recall (Painter et al., 26 Mar 2025). Automated PT selection and query expansion improve reproducibility and coverage, with deterministic spectral frameworks controlling burden/coverage trade-offs quantitatively (Vandenhende et al., 7 Dec 2025). Limitations include dependency on the completeness of PT dictionaries and historical event sets, MedDRA version drift, and residual manual oversight for ambiguous mappings or novel concepts (Painter et al., 26 Mar 2025, Vandenhende et al., 7 Dec 2025).

A plausible implication is that future approaches will increasingly integrate real-time data sources, continually retrain semantic models, and refine utility functions using patient-reported outcomes and workflow metrics. Continuous version updates, learning from adjudicator input, and dynamic thresholding are critical for maintaining high-quality PT-level signal analytics and regulatory compliance.

Key References:

"Ontology-based Semantic Similarity Measures for Clustering Medical Concepts in Drug Safety" (Painter et al., 26 Mar 2025)
"PVLens: Enhancing Pharmacovigilance Through Automated Label Extraction" (Painter et al., 26 Mar 2025)
"Automagically encoding Adverse Drug Reactions in MedDRA" (Combi et al., 2015)
"Semantic Similarity-Informed Bayesian Borrowing for Quantitative Signal Detection of Adverse Events" (Haguinet et al., 16 Apr 2025)
"Automated PRO-CTCAE Symptom Selection based on Prior Adverse Event Profiles" (Vandenhende et al., 7 Dec 2025)
"Automated Generation of Custom MedDRA Queries Using SafeTerm Medical Map" (Vandenhende et al., 8 Dec 2025)
"An Evaluation Benchmark for Adverse Drug Event Prediction from Clinical Trial Results" (Yazdani et al., 19 Apr 2024)