
Safeterm Trial-Safety App Overview

Updated 14 December 2025
  • The Safeterm Trial-Safety App is a clinical trial safety analytics platform that integrates transformer-based MedDRA encoding with automated signal detection and PRO instrument optimization.
  • It utilizes unsupervised semantic clustering, spectral analysis, and advanced visualizations to balance patient burden with comprehensive adverse event coverage.
  • The modular design ensures seamless integration with clinical workflows, enabling reproducible safety assessments and query generation for enhanced regulatory review.

The Safeterm Trial-Safety App is a web- and API-based platform for clinical trial safety analytics and patient-reported outcome (PRO) instrument optimization. Safeterm leverages a high-dimensional transformer-based embedding model to encode MedDRA Preferred Terms (PTs), integrating historical adverse event (AE) data, semantic mapping, clustering, utility-driven selection, and advanced visualizations to support automated signal detection, PRO-CTCAE design, and knowledge-based review. This approach streamlines patient burden–signal coverage trade-offs, enables unsupervised or reproducible MedDRA query generation, and enriches trial data interpretation for sponsors and regulatory professionals (Vandenhende et al., 7 Dec 2025, Vandenhende et al., 8 Dec 2025, Vandenhende et al., 8 Dec 2025, Vandenhende et al., 24 Nov 2025).

1. System Architecture and Data Flow

The Safeterm Trial-Safety App operates through modular backend and frontend components:

  • Frontend (Web Client): Users interact via a React/TypeScript interface, inputting historical AE profiles as MedDRA PT lists (with optional incidence counts). The app provides ranked PRO-CTCAE candidate tables, interactive plots (2D projections, leverage vs. rank), and CSV/Excel export (Vandenhende et al., 7 Dec 2025).
  • Backend API: Implemented using Python (FastAPI/Flask), it exposes RESTful endpoints (e.g., /select_pro for PRO-CTCAE selection, AMQ endpoints for MedDRA queries) that orchestrate mapping, embedding, scoring, clustering, and spectral selection pipelines.
  • Data Stores: SQL/NoSQL databases hold MedDRA dictionaries, mapping tables linking PRO-CTCAE items to PTs, and the Safeterm embedding model (PyTorch, d=300).
  • Outputs: Structured JSON returns candidate term rankings (relevance, utility, diversity, leverage), recommended cut-offs (k_opt), and scores, with direct export and browser-based visualization capabilities.

This architecture supports seamless integration with EDC/pharmacovigilance workflows and enables interactive data-driven refinement for safety monitoring, PRO selection, and query generation.

2. MedDRA Mapping and Semantic Embedding

PRO-CTCAE to MedDRA Mapping: Each PRO-CTCAE symptom (≈124 plain-language items) is manually mapped by expert terminologists to one or two MedDRA PTs, resolving lexical ambiguity via LLTs; this preserves the original PRO intent while providing semantic linkage (Vandenhende et al., 7 Dec 2025).

Safeterm Embedding Model: All MedDRA PTs are encoded in a transformer-based model, trained on large biomedical corpora and MedDRA hierarchy, yielding normalized vectors $\mathbf{e}_{PT}\in \mathbb{R}^{300}$. This embedding space forms the basis for all semantic computations (cosine similarity, clustering, diversity scoring).

Semantic Similarity: For two normalized vectors $x$, $y$, similarity is $\text{cosine}(x,y) = x \cdot y$; broader relationships (clinical, mechanistic, linguistic) are captured beyond strict MedDRA hierarchy (Vandenhende et al., 24 Nov 2025).
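As a minimal sketch of the similarity computation described above: once vectors are scaled to unit length, cosine similarity reduces to a plain dot product. The three-dimensional vectors and their values below are made-up stand-ins for the 300-dimensional PT embeddings, and the helper names are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine(x, y):
    """Cosine similarity of two unit vectors: just their dot product."""
    return sum(a * b for a, b in zip(x, y))

# Toy embeddings (illustrative values only, not real Safeterm vectors).
e_nausea = normalize([0.9, 0.1, 0.2])
e_vomiting = normalize([0.8, 0.2, 0.3])
e_rash = normalize([0.1, 0.9, 0.1])

sim_close = cosine(e_nausea, e_vomiting)  # vectors pointing in similar directions
sim_far = cosine(e_nausea, e_rash)        # vectors pointing in different directions
```

Because the vectors are pre-normalized, no division by norms is needed at query time, which is what makes precomputing pairwise similarities cheap.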

3. Relevance, Utility, and Diversity Ranking

Relevance Scoring:

  • Redundancy among PRO items: $S = E_{PRO}\cdot E_{PRO}^T$, $S_{i,j} = \text{cosine}(e_i,e_j) \in [0,1]$.
  • Relevance to AE history: $Q = E_{trial}\cdot E_{PRO}^T$, $Q_{i,j} = \text{cosine}(e_{trial_i},e_{PRO_j})$.
  • Raw relevance: $R_j = \max_i Q_{i,j}$.
  • Incidence weighting: $W_j = \sum_{i:\, Q_{i,j} > \alpha\cdot \max_i Q_{i,j}} w_i$, with $\alpha=0.9$.
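The relevance quantities above can be sketched in a few lines. This is an illustrative reading of the formulas, not the app's implementation: the function name is hypothetical, $w_i$ is assumed to be the optional incidence count of trial term $i$, and the two-dimensional unit vectors are toy data.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def relevance_and_weight(E_trial, E_pro, w, alpha=0.9):
    """Return (R, W): raw relevance and incidence weight for each PRO item.

    E_trial: unit embeddings of trial AE terms
    E_pro:   unit embeddings of PRO items
    w:       incidence weight per trial term (assumed: incidence counts)
    """
    n_trial, n_pro = len(E_trial), len(E_pro)
    # Q[i][j] = cosine(e_trial_i, e_PRO_j)
    Q = [[dot(et, ep) for ep in E_pro] for et in E_trial]
    # R_j = max_i Q[i][j]
    R = [max(Q[i][j] for i in range(n_trial)) for j in range(n_pro)]
    # W_j sums the weights of trial terms within a factor alpha of the best match
    W = [sum(w[i] for i in range(n_trial) if Q[i][j] > alpha * R[j])
         for j in range(n_pro)]
    return R, W

# Toy data: three trial terms, two PRO items.
E_trial = [[1.0, 0.0], [0.28, 0.96], [0.0, 1.0]]
E_pro = [[1.0, 0.0], [0.0, 1.0]]
w = [10, 5, 2]

R, W = relevance_and_weight(E_trial, E_pro, w)
```

In this toy case the second PRO item collects weight from two trial terms (similarities 0.96 and 1.0 both exceed 0.9 times the maximum), while the first collects weight from one.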

Utility Function:

  • Saturated relevance: $R^*_j = 1/(1+e^{-k(R_j-x_0)})$ ($k=20$, $x_0=0.8$).
  • Combined utility: $U_j = R^*_j + \beta \cdot (W_j/\max_j W_j)$, $\beta=0.1$.
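A short sketch of the utility function, using the parameter values quoted above ($k=20$, $x_0=0.8$, $\beta=0.1$). The function names are hypothetical, the input values are toy data, and the handling of an all-zero weight vector is an assumption, since the source does not describe that case.

```python
import math

def saturated_relevance(R, k=20.0, x0=0.8):
    """Logistic saturation: relevance well above x0 maps near 1, well below near 0."""
    return [1.0 / (1.0 + math.exp(-k * (r - x0))) for r in R]

def combined_utility(R, W, beta=0.1, k=20.0, x0=0.8):
    """U_j = R*_j + beta * (W_j / max_j W_j)."""
    R_star = saturated_relevance(R, k, x0)
    w_max = max(W)
    if w_max == 0:
        # Assumption: with no incidence information, utility is relevance only.
        return R_star
    return [rs + beta * (wj / w_max) for rs, wj in zip(R_star, W)]

# Toy inputs: raw relevance and incidence weight for three PRO items.
R = [0.95, 0.80, 0.60]
W = [10.0, 7.0, 0.0]
U = combined_utility(R, W)
```

With these inputs, the item at $R_j = x_0$ has a saturated relevance of exactly 0.5, and the incidence term contributes at most $\beta = 0.1$, so relevance dominates the ordering and incidence acts as a tie-breaker.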

L-kernel for Utility/Diversity: From Determinantal Point Process theory, $L_{i,j} = U_i \cdot S_{i,j} \cdot U_j$, $L = \text{diag}(U)\,S\,\text{diag}(U)$. Diagonal entries capture utility; off-diagonals encode semantic overlap penalties.
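To make the utility/diversity trade-off concrete, here is a minimal sketch that builds $L$ from toy values and compares two candidate pairs. In an L-ensemble determinantal point process, the probability of a subset is proportional to the determinant of the corresponding submatrix of $L$, so a pair of near-duplicate items scores lower than a pair of distinct ones. The function names and all numbers are illustrative assumptions.

```python
def l_kernel(U, S):
    """L[i][j] = U[i] * S[i][j] * U[j], i.e. diag(U) @ S @ diag(U)."""
    n = len(U)
    return [[U[i] * S[i][j] * U[j] for j in range(n)] for i in range(n)]

def pair_det(L, a, b):
    """Determinant of the 2x2 submatrix of L indexed by items a and b."""
    return L[a][a] * L[b][b] - L[a][b] * L[b][a]

# Toy utilities and similarity matrix: items 0 and 1 are near-duplicates.
U = [1.0, 0.5, 0.8]
S = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
L = l_kernel(U, S)

det_redundant = pair_det(L, 0, 1)  # similar items: small determinant
det_diverse = pair_det(L, 0, 2)    # dissimilar items: larger determinant
```

Since $S_{i,i} = 1$, each diagonal entry equals $U_i^2$, which is why the diagonal carries utility alone.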

4. Spectral Analysis and Orthogonal Symptom Selection

Eigen-Decomposition and Explained Variance:

  • $L = V\Lambda V^T$, where $\Lambda = \text{diag}(\lambda_1,...,\lambda_{N_{\text{PRO}}})$, $V$ orthonormal eigenvectors.
  • Cumulative explained variance: $CV(j) = (\sum_{i=1}^j \lambda_i)/(\sum_{i=1}^{N_{\text{PRO}}} \lambda_i)$.
  • Minimal orthogonal set size: $k_{opt} = \min\{j \mid CV(j) \geq \text{info\_threshold}\}$ ($0.90$–$0.975$ typical).
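The cut-off rule above can be sketched as follows, assuming the eigenvalues of $L$ have already been obtained from a symmetric eigen-solver (for example `numpy.linalg.eigh`). The function name and the eigenvalues are hypothetical toy values.

```python
def k_opt(eigenvalues, info_threshold=0.95):
    """Smallest j whose cumulative explained variance reaches info_threshold."""
    lam = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    total = sum(lam)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for j, value in enumerate(lam, start=1):
        running += value
        if running / total >= info_threshold:
            return j
    return len(lam)

# Toy spectrum: cumulative variance is 0.50, 0.80, 0.90, 0.96, 1.00.
eigenvalues = [5.0, 3.0, 1.0, 0.6, 0.4]

k_95 = k_opt(eigenvalues, info_threshold=0.95)
k_975 = k_opt(eigenvalues, info_threshold=0.975)
```

Raising the information threshold can only keep $k_{opt}$ the same or increase it, which is the lever for trading patient burden against coverage.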

Diversity Leverage Score:

  • For item $j$: $\text{Leverage}_j = \sum_{i=1}^{k_{opt}} (V_{j,i})^2$.
  • Items are rank-ordered by leverage, enforcing selection of the top $k_{opt}$ for coverage across all axes.
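A minimal sketch of the leverage ranking, assuming the eigenvector matrix is given with columns ordered by descending eigenvalue, so that `V[j][i]` is the loading of item `j` on eigenvector `i`. The function names and the small orthonormal matrix are illustrative.

```python
def leverage_scores(V, k):
    """Leverage_j = sum of squared loadings of item j on the top-k eigenvectors."""
    return [sum(V[j][i] ** 2 for i in range(k)) for j in range(len(V))]

def select_by_leverage(V, k):
    """Return the indices of the k highest-leverage items, plus all scores."""
    scores = leverage_scores(V, k)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k], scores

# Toy orthonormal eigenvector matrix (rows = items, columns = eigenvectors).
V = [
    [0.6, 0.8, 0.0],
    [-0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]
k = 2
selected, scores = select_by_leverage(V, k)
```

Because the columns of $V$ are orthonormal, each leverage score lies in $[0,1]$ and the scores sum to $k$; here items 0 and 1 lie entirely within the top two axes and item 2 contributes nothing to them.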

5. Automated MedDRA Query Generation and Validation

Safeterm incorporates AMQ (Automated Medical Query) features for MedDRA term retrieval (Vandenhende et al., 8 Dec 2025, Vandenhende et al., 8 Dec 2025):

  • Workflow: Free-text query or MedDRA PT input → embedding → cosine similarity computation → extreme-value (two-means) clustering → knee-point threshold selection → ranked PT candidate list.
  • Thresholding: Lower thresholds (e.g., 0.50–0.60) maximize recall (≈0.94 for SMQs, ≈0.95 for OCMQs); higher thresholds (0.70–0.90) increase precision (up to 0.89 for SMQs, 0.86 for OCMQs), sacrificing recall.
  • Performance: For the optimal F1 threshold (≈0.70): SMQ recall 0.48/precision 0.45/F1 0.44; OCMQ recall 0.57/precision 0.34/F1 0.37.
  • Narrow-term PTs: These require slightly higher similarity thresholds; recall is maintained, while precision is slightly reduced as a result of gold set size.
  • Recommendations: Use valid MedDRA PTs as queries, adjust thresholds to match sensitivity/specificity needs, integrate with EDC systems for real-time query generation and review.
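The threshold trade-off described above can be illustrated with a small sketch that filters candidate PTs by similarity and scores the result against a reference set. It covers only the final thresholding and evaluation steps, not the clustering or knee-point selection. The function names, similarity values, and gold set are made up for illustration and are not taken from the cited evaluations.

```python
def retrieve(similarities, threshold):
    """Return PTs whose similarity to the query meets the threshold, best first."""
    hits = [(pt, s) for pt, s in similarities.items() if s >= threshold]
    return [pt for pt, _ in sorted(hits, key=lambda item: item[1], reverse=True)]

def precision_recall_f1(retrieved, gold):
    """Standard set-based precision, recall, and F1."""
    retrieved, gold = set(retrieved), set(gold)
    tp = len(retrieved & gold)
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up cosine similarities between a query and candidate PTs.
similarities = {
    "Hepatic failure": 0.88,
    "Liver injury": 0.81,
    "Jaundice": 0.66,
    "Hepatitis": 0.58,
    "Nausea": 0.55,
    "Headache": 0.31,
}
gold = {"Hepatic failure", "Liver injury", "Jaundice", "Hepatitis"}

low = precision_recall_f1(retrieve(similarities, 0.50), gold)
high = precision_recall_f1(retrieve(similarities, 0.70), gold)
```

In this toy case the lower threshold recovers every gold term at the cost of one false positive, while the higher threshold returns only correct terms but misses half of the gold set.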

6. Visualization, Knowledge Layer, and Clustering

Hidden Medical Knowledge Layer: Safeterm augments MedDRA PTs with high-dimensional embeddings, semantic descriptors, and precomputed pairwise cosine similarities, forming a latent relationship graph (Vandenhende et al., 24 Nov 2025).

Automatic Clustering:

  • Trial-observed PT embeddings are reduced (PCA) and clustered via agglomerative or k-means algorithms.
  • Cluster identity is decoded via AI translators from embedding centroids; ungrouped PTs (low silhouette scores) are flagged and colored distinctly.

Shrinkage Incidence Ratio (SIR) and Cluster-Level EBGM:

  • Expected count: $E_{ij} = N_i \frac{n_{\cdot j}}{N_{\cdot}}$; shrinkage incidence ratio: $SIR_{ij} = \frac{n_{ij} + \alpha}{E_{ij} + \beta}$ (gamma-Poisson shrinkage).
  • Cluster-level aggregation: Precision-weighted mean $\mathrm{EBGM}_{i,C} = \frac{\sum_{j\in C} w_j SIR_{ij}}{\sum_{j\in C} w_j}$, $w_j=\frac{n_{ij}+\alpha}{SIR_{ij}^2}$.
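A worked sketch of the two formulas above. The interpretation of the symbols is an assumption ($N_i$ as the size of arm $i$, $n_{\cdot j}$ as the total count of PT $j$ across arms, $N_{\cdot}$ as the total number of subjects), and the prior values $\alpha = \beta = 0.5$ are placeholders, since the source does not state them. All counts are toy data.

```python
def expected_count(N_i, n_dot_j, N_dot):
    """E_ij = N_i * n_.j / N_. (count expected if arms had equal incidence)."""
    return N_i * n_dot_j / N_dot

def sir(n_ij, E_ij, alpha=0.5, beta=0.5):
    """Shrinkage incidence ratio; alpha and beta are placeholder prior values."""
    return (n_ij + alpha) / (E_ij + beta)

def cluster_ebgm(counts, expecteds, alpha=0.5, beta=0.5):
    """Precision-weighted mean of SIR over the PTs in one cluster."""
    sirs = [sir(n, e, alpha, beta) for n, e in zip(counts, expecteds)]
    weights = [(n + alpha) / s ** 2 for n, s in zip(counts, sirs)]
    return sum(w * s for w, s in zip(weights, sirs)) / sum(weights)

# Toy cluster of two PTs observed in an arm of 100 subjects (200 in total).
N_i, N_dot = 100, 200
counts = [9, 5]         # n_ij: events in arm i
totals = [12, 6]        # n_.j: events across all arms
expecteds = [expected_count(N_i, t, N_dot) for t in totals]

sirs = [sir(n, e) for n, e in zip(counts, expecteds)]
ebgm = cluster_ebgm(counts, expecteds)
```

As a weighted mean, the cluster-level value always lies between the smallest and largest SIR in the cluster, and the added constants pull ratios based on very small counts toward $\alpha/\beta$.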

Visualization Outputs:

  • Semantic Map: 2D PCA/t-SNE projection of PTs, colored by semantic cluster, sized by incidence rate; interactive filtering and tooltip details.
  • Expectedness-versus-Disproportionality Plot (EVD): X-axis: expectedness (cosine similarity to disease indication vector), Y-axis: $SIR_{ij}$. Points colored by cluster, sized by incidence. Outliers (low expectedness, high $SIR$) denote novel safety signals.
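The outlier region of the EVD plot can be expressed as a simple filter. This is only a sketch: the source describes outliers qualitatively, so the numeric cut-offs, the function name, the record layout, and the data points below are all hypothetical.

```python
def flag_candidate_signals(points, max_expectedness=0.3, min_sir=2.0):
    """Return PTs in the low-expectedness, high-SIR corner of the plot.

    Both cut-offs are hypothetical; in practice they would be chosen by review.
    """
    return [
        p["pt"]
        for p in points
        if p["expectedness"] < max_expectedness and p["sir"] > min_sir
    ]

# Made-up points: expectedness is cosine similarity to the indication vector.
points = [
    {"pt": "Fatigue", "expectedness": 0.72, "sir": 1.1},
    {"pt": "Anaemia", "expectedness": 0.65, "sir": 2.4},
    {"pt": "Hepatic enzyme increased", "expectedness": 0.18, "sir": 3.2},
    {"pt": "Headache", "expectedness": 0.15, "sir": 0.9},
]

flagged = flag_candidate_signals(points)
```

Only the term that is both unexpected for the indication and disproportionately frequent is flagged; a disproportionate but expected term is left for routine review.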

7. Empirical Results and Practical Integration

Monte Carlo Simulations (N=100,000): Mean recall 0.70, precision 0.72, F1 0.70 (info threshold 97.5%), stable across signal/noise levels (Vandenhende et al., 7 Dec 2025).

Oncology Case Study (Multiple Myeloma):

  • Phase I: Algorithm selected $k_{opt}=16$ PRO-CTCAE items; all matched AE PTs; 9 exact-matches flagged and excluded for redundancy.
  • Phase II: Automated list overlapped with 8 of 15 manual PROs; coverage was comparable (auto 11/16, manual 11/15 retrieved).
  • Automated selection provided objective, reproducible design and explicit burden–coverage justification.

Legacy Trials with Semantic Clustering:

  • Duchenne Muscular Dystrophy: Liver damage cluster detected (semantic map, cluster-level EBGM); minor hepatotoxicity signals enriched.
  • Narcolepsy Dose-Response: Dose-dependent stress cluster SIR rise detected.
  • Hodgkin’s Lymphoma: Bone marrow failure cluster differentiated between treatments.

Practical Recommendations:

  • Start broad signal detection at moderate thresholds, then refine for specificity as needed.
  • Leverage semantic clustering and visualization for hypothesis generation and transparent safety review.
  • Integrate app endpoints with clinical EDC, pharmacovigilance, and dashboard systems.

Safeterm transforms trial safety workflows by embedding MedDRA PTs in a semantically calibrated hidden space, enabling objective, reproducible PRO selection, rapid and unsupervised term query generation, and advanced clustering-based signal analysis, validated across diverse oncology and neurology trials (Vandenhende et al., 7 Dec 2025, Vandenhende et al., 8 Dec 2025, Vandenhende et al., 8 Dec 2025, Vandenhende et al., 24 Nov 2025).
