Safeterm Map: Visualizing Safety Signal Clusters
- Safeterm Map is a graphical tool that combines MedDRA taxonomy with hidden semantic embeddings to analyze treatment-emergent adverse events.
- It leverages PCA and hierarchical clustering to reduce dimensions and automatically group semantically similar AE terms.
- The method overlays disproportionality metrics and indication-specific expectedness scores to rapidly identify and prioritize safety signals.
A Safeterm map is a graphical method for visualizing and analyzing treatment-emergent adverse events (AEs) in clinical trials. It integrates MedDRA's hierarchical AE taxonomy with a high-dimensional hidden knowledge layer ("Safeterm") that encodes clinical, mechanistic, and contextual relationships between AE terms. This approach enables efficient, accurate clustering and rapid detection of safety signals by projecting AEs into a semantically meaningful 2-D map, automatically regrouping related terms, and overlaying disproportionality metrics and indication-specific expectedness for interpretation (Vandenhende et al., 24 Nov 2025).
1. Safeterm Hidden-Knowledge Layer and Motivation
MedDRA provides a fixed, hierarchical taxonomy but lacks the ability to represent latent clinical or mechanistic relationships that clinicians use during AE review. The Safeterm layer augments MedDRA by mapping each Preferred Term (PT) to a high-dimensional embedding, learned from medical text corpora, ontologies, and drug-event databases. These embeddings capture semantic similarity beyond the strict parent-child MedDRA hierarchy. Consequently, PTs that are not MedDRA siblings but are semantically or mechanistically related (e.g., various grades of liver-test elevation) are positioned closer in the embedding space.
Traditional AE review relies on visual inspection of long lists of PTs or the use of predefined Standardized MedDRA Queries (SMQs), which can miss nuanced or mechanistically related signals. The addition of the Safeterm layer enables three primary enhancements:
- Automatic clustering of semantically similar PTs
- Projection onto a readable 2-D map
- Overlay of both disproportionality (signal strength) and disease-specific expectedness, supporting rapid and context-sensitive signal triage
This approach improves clarity (related AEs are grouped), efficiency (reviewers avoid redundant scanning), and accuracy (signals aggregate across similar events).
2. Construction and Semantics of the Safeterm Map
Each MedDRA PT $i$ is encoded as a vector $e_i \in \mathbb{R}^d$ (with $d$ in the several hundreds), representing its multi-modal semantic and contextual profile. Principal Component Analysis (PCA) is applied to the subset of PT embeddings observed in the trial, retaining the top $K$ axes for denoising and cluster separability. For visualization, coordinates are further reduced to the first two PCA components; each PT is plotted as a point $(\mathrm{PC}_1, \mathrm{PC}_2)$ in this 2-D space.
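Semantic closeness between PTs $i$ and $j$ is measured by Euclidean distance in the reduced space:
$$d(i, j) = \lVert P_K e_i - P_K e_j \rVert_2$$
Here, $e_i$ is the embedding of PT $i$ and $P_K$ denotes projection onto the top $K$ PCA axes.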
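The PCA reduction and the reduced-space distance described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the toy random embeddings, and the choice `k=5` are all assumptions made for the example.

```python
import numpy as np

def pca_project(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Project row-vector embeddings onto their top-k principal axes."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def reduced_distance(scores: np.ndarray, i: int, j: int) -> float:
    """Euclidean distance between PTs i and j in the PCA-reduced space."""
    return float(np.linalg.norm(scores[i] - scores[j]))

# Toy stand-in for real PT embeddings: 12 hypothetical PTs, 50 dimensions.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(12, 50))

scores = pca_project(toy_embeddings, k=5)  # k is an arbitrary example value
map_xy = scores[:, :2]                     # first two components for plotting
```

Because the principal axes are orthonormal, a distance computed in the reduced space never exceeds the distance between the same two PTs in the original embedding space.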
For indication-specific expectedness, each PT's embedding $e_i$ is compared to a trial population descriptor $q$ (extracted from the protocol) using cosine similarity:
$$\mathrm{Exp}_i = \frac{e_i \cdot q}{\lVert e_i \rVert \, \lVert q \rVert}$$
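As a small illustration of the expectedness score, the sketch below computes cosine similarity between each PT embedding and a population descriptor vector. How the descriptor is embedded from the protocol is not covered here; the function name and the 2-D toy vectors are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np

def expectedness(pt_embeddings: np.ndarray, population_descriptor: np.ndarray) -> np.ndarray:
    """Cosine similarity of each PT embedding (one per row) to the descriptor.

    Assumes no embedding is the zero vector; otherwise the division is undefined.
    """
    pt_norms = np.linalg.norm(pt_embeddings, axis=1)
    q_norm = np.linalg.norm(population_descriptor)
    return (pt_embeddings @ population_descriptor) / (pt_norms * q_norm)

# Hypothetical 2-D embeddings: aligned, 45 degrees off, orthogonal, opposite.
pts = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0], [-1.0, 0.0]])
q = np.array([1.0, 0.0])
expected_scores = expectedness(pts, q)
```

The score depends only on direction, not vector length, so the third PT scores 0 despite having a larger norm than the others.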
Hierarchical agglomerative clustering (Ward’s method) is applied to the $K$-dimensional PCA-reduced vectors (with $K$ the number of retained PCA axes), and the tree is cut at a user-defined height to generate semantically coherent groups. An AI-based decoder provides human-interpretable cluster labels from centroids in the original embedding space.
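The clustering step can be sketched with SciPy's hierarchical clustering routines. This is an illustration under stated assumptions rather than the published pipeline: the helper name `ward_clusters`, the synthetic two-group data, and the cut height of 2.0 are hypothetical, and the AI-based labelling of clusters is not reproduced.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def ward_clusters(scores: np.ndarray, cut_height: float) -> np.ndarray:
    """Ward linkage on PCA-reduced vectors, cut at a user-defined height."""
    tree = linkage(scores, method="ward")
    # criterion="distance" cuts the dendrogram at the given merge height.
    return fcluster(tree, t=cut_height, criterion="distance")

# Two well-separated synthetic groups standing in for PCA-reduced PT vectors.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 3))
group_b = rng.normal(loc=5.0, scale=0.1, size=(5, 3))
toy_scores = np.vstack([group_a, group_b])

labels = ward_clusters(toy_scores, cut_height=2.0)
```

Lowering `cut_height` splits the data into more, tighter clusters and raising it merges them, which is why the user-defined height controls cluster granularity.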
3. Quantitative Metrics for Signal Detection
Signal quantification within the Safeterm map proceeds via shrinkage incidence ratios, employing Empirical Bayes Geometric Mean (EBGM) computations adapted from the Gamma–Poisson Shrinker paradigm.
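Let $n_{ij}$ and $N_j$ denote, respectively, the count of subjects experiencing PT $i$ and the number at risk in arm $j$, and let $E_{ij}$ be the corresponding expected count. The shrinkage incidence ratio for PT $i$ in arm $j$ combines $n_{ij}$ and $E_{ij}$ with a pair of hyper-parameters that stabilize low counts (standard for empirical Bayes), and it yields the PT-level score $\mathrm{EBGM}_{ij}$. Each $\mathrm{EBGM}_{ij}$ has an approximate variance $v_{ij}$, and the precision weight is its inverse, $w_{ij} = 1 / v_{ij}$.
The cluster-level EBGM for arm $j$ and cluster $c$ is the precision-weighted average:
$$\mathrm{EBGM}_{cj} = \frac{\sum_{i \in c} w_{ij} \, \mathrm{EBGM}_{ij}}{\sum_{i \in c} w_{ij}}$$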
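The sketch below walks through the shrinkage and precision-weighting steps on toy counts. The exact expressions for the expected count, the shrinkage ratio, and the variance are not given in this text, so the code substitutes common Gamma-Poisson choices, all of which are assumptions: the expected count pools incidence across arms, the shrunk ratio is the Gamma posterior mean `(n + alpha) / (E + beta)` used here as a stand-in for the PT-level score, the variance is the matching Gamma posterior variance, and `alpha = beta = 0.5` are hypothetical values.

```python
import numpy as np

def expected_counts(n: np.ndarray, at_risk: np.ndarray) -> np.ndarray:
    """Assumed form: pooled incidence of each PT times subjects at risk per arm.

    n has shape (PTs, arms); at_risk has shape (arms,).
    """
    pooled_rate = n.sum(axis=1, keepdims=True) / at_risk.sum()
    return pooled_rate * at_risk

def shrunk_ratio(n, expected, alpha=0.5, beta=0.5):
    """Assumed form: Gamma posterior mean, which pulls sparse ratios toward alpha/beta."""
    return (n + alpha) / (expected + beta)

def cluster_score(n, expected, members, alpha=0.5, beta=0.5):
    """Precision-weighted average of member PT ratios, one value per arm."""
    ratio = shrunk_ratio(n[members], expected[members], alpha, beta)
    # Assumed variance: Gamma posterior variance of the shrunk ratio.
    variance = (n[members] + alpha) / (expected[members] + beta) ** 2
    weights = 1.0 / variance
    return (weights * ratio).sum(axis=0) / weights.sum(axis=0)

# Toy data: 3 hypothetical PTs, 2 arms (control, active), 50 subjects each.
counts = np.array([[1, 6], [0, 4], [5, 5]])
at_risk = np.array([50, 50])

expected = expected_counts(counts, at_risk)
ratios = shrunk_ratio(counts, expected)
cluster = cluster_score(counts, expected, members=[0, 1])
```

In this toy example the first two PTs are both more frequent in the active arm, so the cluster value for that arm exceeds 1 and the control-arm value falls below 1. The third PT is balanced across arms and its ratio is exactly 1.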
For interpretation, the Expectedness vs. Disproportionality (EVD) plot represents each PT as a point whose coordinates are its expectedness score and its disproportionality (EBGM).
4. Visual Outputs and Interpretation
Two primary visualizations support interpretation:
- Safeterm Map: Each PT appears as a dot in the plane of the first two PCA components; color denotes cluster membership, and size represents incidence proportion. Unclustered PTs are shown in brown. Typical use overlays trial arms or facets by arm. Clusters form contiguous clouds, enabling visual identification of treatment-specific AE patterns.
- EVD Plot: The x-axis is expectedness (cosine similarity to the trial population), and the y-axis is disproportionality (EBGM). Dot size indicates incidence; color encodes the cluster. This facilitates rapid differentiation among:
- High disproportionality, low expectedness: candidate unexpected safety signals
- High disproportionality, high expectedness: on-indication increases
- Low disproportionality, low expectedness: no unusual findings
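The three regions listed above amount to a simple triage rule, which can be sketched as follows. The cut-offs `exp_cut` and `disp_cut` are hypothetical placeholders: no numeric thresholds are given in the text. The fourth combination, low disproportionality with high expectedness, is returned as a plain description because the text does not name it.

```python
def evd_category(expectedness: float, disproportionality: float,
                 exp_cut: float = 0.5, disp_cut: float = 2.0) -> str:
    """Assign a PT to an EVD region using hypothetical cut-offs."""
    high_disp = disproportionality >= disp_cut
    high_exp = expectedness >= exp_cut
    if high_disp and not high_exp:
        return "candidate unexpected safety signal"
    if high_disp and high_exp:
        return "on-indication increase"
    if not high_exp:
        return "no unusual findings"
    # Combination not categorized in the text above.
    return "low disproportionality, high expectedness"

# Hypothetical PTs given as (expectedness, disproportionality) pairs.
examples = {
    "PT A": (0.1, 3.0),
    "PT B": (0.8, 2.5),
    "PT C": (0.2, 1.0),
}
triage = {name: evd_category(*xy) for name, xy in examples.items()}
```

In practice the cut-offs would need to be chosen per trial, since expectedness scores and disproportionality values depend on the embedding and on the shrinkage settings.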
5. Case Studies and Empirical Validation
The Safeterm map method was validated retrospectively on three legacy clinical trials:
- NCT05096221 – AAV gene therapy (Duchenne Muscular Dystrophy): Of 72 PTs, 44 were assigned to 7 clusters. The “Liver damage” cluster (e.g., GGT increased, hyperbilirubinemia), located centrally-left on the map, had all constituent PTs with in active arms, with peak values –$3$, accurately reflecting the known liver safety signal. Cluster-level mirrored this effect.
- NCT02348593 – JZP-110 in narcolepsy: 18 PTs grouped into two clusters (“Stress response” and “Respiratory infection”). The stress response cluster exhibited monotonic increases in both incidence and EBGM with higher dosing, matching known pharmacodynamics.
- NCT05008224 – KEYNOTE-C11 lymphoma trial: 104 PTs formed 12 clusters. The “Bone marrow failure” cluster tracked the escBEACOPP arm, highlighting confounding related to therapy allocation.
In all examples, the Safeterm workflow automatically recovered expected safety signals without manual grouping, reduced review time by over 50% compared to SMQ- or SOC-based tabulations, and improved signal-to-noise characteristics by consolidating spurious single-PT signals into cluster-level metrics.
6. Practical and Methodological Significance
The addition of the Safeterm layer to MedDRA enables a seamless, automated, and context-aware workflow for AE signal detection in clinical trials. The framework’s ability to encode clinical semantics, cluster PTs dynamically, and integrate trial-specific expectedness scores enhances the rigor of AE review. Its graphical outputs provide intuitive, yet analytically justifiable, panels for rapid prioritization and triage of safety findings. The approach increases reviewer efficiency, clarity, and accuracy over legacy enumeration or rule-based SMQ grouping. While a formal Bayesian hierarchical extension is not yet implemented, the current method provides a robust empirical Bayes summary for both PT- and cluster-level signals (Vandenhende et al., 24 Nov 2025).
7. Limitations and Future Extensions
The current Safeterm map approach uses pre-trained embeddings and empirical Bayesian shrinkage to stabilize low-count estimates. No additional Bayesian hierarchical model was fitted, though the underlying framework supports extension to full two-level modeling. Cluster granularity remains user-defined, which may affect sensitivity to rare or heterogeneous signals. The method has demonstrated value across diverse clinical contexts, including gene therapy, stimulant, and oncology trials, but further validation in other disease areas and integration with real-time surveillance systems remain warranted.