Molecular Tumor Boards Overview
- Molecular Tumor Boards are collaborative, multidisciplinary forums in precision oncology that integrate clinical, pathological, and molecular data to guide personalized cancer care.
- They follow a structured workflow from pre-meeting data consolidation and specialist discussion to consensus formulation and iterative review as new information emerges.
- Emerging AI and deep learning systems enhance MTB efficiency by automating data extraction, refining biomarker prediction, and supporting rigorous evaluation of clinical decisions.
Molecular Tumor Boards (MTBs) are multidisciplinary forums in precision oncology, designed for collaborative evaluation of complex cancer cases through integrated assessment of clinical, pathological, and molecular data. They bring together specialists (oncologists, pathologists, geneticists, radiologists, molecular biologists, and bioinformaticians) in a structured setting to support evidence-driven, personalized treatment recommendations. MTB workflows are iterative: analyses are updated as new inputs (such as biopsy results or sequencing data) become available, so that recommendations reflect the latest patient-specific evidence (Codella et al., 8 Sep 2025, Vasilev et al., 25 Nov 2025).
1. Multidisciplinary Structure and Workflow of MTBs
The canonical MTB process involves pre-meeting preparation with review and summarization of heterogeneous patient records, followed by team discussion, recommendation formulation, and post-meeting implementation:
- Pre-meeting preparation: A clinician or designated medical assistant collates structured (EHR, lab, genomic results) and unstructured (clinical notes) patient documents, generating a concise summary for team orientation.
- Discussion phase: Each specialist contributes domain insights. Pathologists interpret H&E and IHC slides; geneticists present molecular profiles such as somatic mutations and copy-number changes; radiologists and oncologists synthesize findings into diagnostic timelines and suggest therapeutic options or clinical trial eligibility.
- Consensus and iteration: MTBs revisit cases as more data are acquired, with recommendations dynamically adjusted to reflect new information, mirroring the longitudinal nature of cancer management.
- Documentation and action: Agreed recommendations are formally recorded and implemented as treatment plans.
Two critical challenges in this workflow are the substantial clinician time investment (mean preparation times: 81.7 min for radiologists, 144.0 min for pathologists) and considerable variability and omissions in data abstraction, especially for comorbidities, patient perspectives, and molecular details. Poor summary quality risks suboptimal decisions or missed actionable findings (Codella et al., 8 Sep 2025, Vasilev et al., 25 Nov 2025).
2. Data Modalities and Sequential Reasoning in MTBs
MTBs operate over diverse, complex data modalities:
| Data Modality | Examples | MTB Role |
|---|---|---|
| Digital Pathology | H&E slides, multiple IHC stains (CD3, CD8, CD68, CD163) | Tumor characterization, immune infiltration |
| Genomics | Somatic mutation calls, CNV, fusion events, MSI, TMB | Biomarker-driven therapy selection |
| Laboratory/Hematology | CRP, MPV, leukocyte subsets, creatinine, coagulation profile | Risk stratification, pre-treatment workup |
| Clinical Timelines/Notes | Demographics, prior therapies, surgical and outcome reports | Diagnostic/prognostic chronology |
MTB workflows are explicitly longitudinal. Information is revealed incrementally, forcing sequential decision-making: initial histopathological diagnosis is followed by preoperative laboratory investigation and post-surgical integration of additional modalities. This process encourages hypothesis refinement and recommendation revision as new evidence emerges (Vasilev et al., 25 Nov 2025). In computational modeling (e.g., MTBBench), agents replicate this by managing evolving timelines and context memory as case stages progress.
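The incremental reveal described above can be sketched as a loop over case stages with a cumulative context. This is a minimal illustration, not MTBBench's actual interface: the names `Stage`, `CaseContext`, `run_case`, and `toy_decide`, the file names, and the document contents are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Stage:
    """One step of a case timeline and the documents it releases."""
    name: str
    documents: dict[str, str]  # hypothetical file name -> text content


@dataclass
class CaseContext:
    """Cumulative memory: everything revealed up to the current stage."""
    seen: dict[str, str] = field(default_factory=dict)
    stages: list[str] = field(default_factory=list)

    def reveal(self, stage: Stage) -> None:
        self.seen.update(stage.documents)
        self.stages.append(stage.name)


def run_case(stages: list[Stage], decide: Callable[[CaseContext], str]) -> list[str]:
    """Call `decide` once per stage, after that stage's documents are revealed."""
    ctx = CaseContext()
    recommendations = []
    for stage in stages:
        ctx.reveal(stage)
        recommendations.append(decide(ctx))
    return recommendations


def toy_decide(ctx: CaseContext) -> str:
    # Placeholder policy: a real agent would reason over ctx.seen with a model.
    return f"after {ctx.stages[-1]}: {len(ctx.seen)} document(s) available"


# Made-up three-stage timeline mirroring the order described in the text.
timeline = [
    Stage("histopathology", {"he_report.txt": "invasive carcinoma"}),
    Stage("preoperative labs", {"labs.txt": "CRP elevated"}),
    Stage("post-surgery", {"ihc.txt": "CD8 infiltrate", "genomics.txt": "TMB high"}),
]
recommendations = run_case(timeline, toy_decide)
```

Because `decide` only ever receives the context accumulated so far, an early-stage recommendation cannot depend on documents released later, which is the property that makes the setting sequential.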
3. Agentic Frameworks and LLM-Driven MTB Support Systems
Recent advances demonstrate agentic AI architectures for supporting or automating MTB workflows. The Healthcare Agent Orchestrator (HAO) is a modular, LLM-driven, multi-agent system mapping closely to real MTB roles:
- Specialized agents (PatientHistory, Radiology, Pathology, ClinicalTrials) extract domain-specific information from corresponding data sources, using zero-shot or specialized LLM prompts.
- Orchestrator: Determines agent invocation per case complexity, manages turn-taking and summary collation, and enforces verification checkpoints for output validity.
- Workflow example: The orchestrator issues granular tasks (e.g., "Generate a timeline including diagnosis date, histology, biomarkers"), agents return citation-backed outputs, and verification steps audit factual traceability to the original records.
- System architecture integrates clinical tools (EHR queries, PACS viewers), institutional knowledge bases, and secure tool/data connectivity using the Model Context Protocol (MCP) (Codella et al., 8 Sep 2025).
A plausible implication is that agentic orchestration enables precision, traceability, and error localization in patient summary generation, addressing common pitfalls of manual abstraction.
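The orchestration pattern described above (specialist agents returning citation-backed claims, followed by a verification checkpoint) can be sketched as follows. This is a simplified illustration under stated assumptions, not HAO's implementation: the agents here are plain functions standing in for LLM calls, every agent is invoked for every case, and the names `Claim`, `verify`, `orchestrate`, `pathology_agent`, and `careless_agent` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Claim:
    text: str        # statement proposed for the patient summary
    source_id: str   # record the agent cites
    evidence: str    # verbatim span the agent says supports the claim


Records = dict[str, str]
Agent = Callable[[Records], list[Claim]]


def pathology_agent(records: Records) -> list[Claim]:
    # Stand-in for an LLM call: quote the report's first sentence.
    report = records.get("pathology", "")
    first = report.split(".")[0].strip()
    return [Claim(f"Pathology: {first}", "pathology", first)] if first else []


def careless_agent(records: Records) -> list[Claim]:
    # Deliberately emits a claim whose evidence is absent from the record.
    return [Claim("EGFR mutation present", "pathology", "EGFR L858R detected")]


def verify(claim: Claim, records: Records) -> bool:
    """Checkpoint: the cited record must contain the quoted evidence."""
    return bool(claim.evidence) and claim.evidence in records.get(claim.source_id, "")


def orchestrate(records: Records, agents: dict[str, Agent]):
    """Run each agent, then split its claims into verified and unverified."""
    accepted, rejected = [], []
    for name, agent in agents.items():
        for claim in agent(records):
            bucket = accepted if verify(claim, records) else rejected
            bucket.append((name, claim))
    return accepted, rejected


# Made-up record text for illustration only.
records = {"pathology": "Moderately differentiated adenocarcinoma. Margins negative."}
accepted, rejected = orchestrate(
    records, {"Pathology": pathology_agent, "Careless": careless_agent}
)
```

Keeping the agent name next to each rejected claim is what allows an error to be localized to the agent that produced it.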
4. Computational Evaluation and Benchmarking in MTB Environments
Measuring the fidelity of AI-generated MTB outputs demands robust, claim-level evaluation strategies. The "model-as-a-judge" paradigm implemented in TBFact uses LLMs to decompose summaries into atomic, verifiable clinical facts, with bidirectional entailment scoring:
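- Recall under strict entailment: Each reference fact is scored against the generated summary as full, partial, or no entailment.
- Metric: Strict recall is the fraction of reference facts scored as fully entailed; a relaxed variant additionally gives credit for partial entailments.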
- Performance: The PatientHistory agent achieved a TBFact recall of $0.84$ (strict) and included of high-importance information with partial credits (Codella et al., 8 Sep 2025).
- Precision, recall, F1 are stratified by information importance; errors are attributed as either omissions or unsupported additions, enabling granular analysis.
- Benchmarks (MTBBench) simulate MTB-style reasoning over multimodal, longitudinal cases, measuring model accuracy and information-seeking behavior (file access count), and are validated via expert annotation (Cohen's $\kappa$ for ground-truth agreement) (Vasilev et al., 25 Nov 2025).
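The claim-level scoring described in the list above can be sketched once per-fact entailment labels are available. This is an illustrative sketch, not TBFact's code: the labels would come from an LLM judge and are hard-coded here, the 0.5 weight for partial entailment is an assumption, and the function names are hypothetical.

```python
from collections import Counter

CREDIT_STRICT = {"full": 1.0, "partial": 0.0, "none": 0.0}
CREDIT_RELAXED = {"full": 1.0, "partial": 0.5, "none": 0.0}  # 0.5 is an assumed weight


def coverage(labels, credit):
    """Average credit over a list of per-fact entailment labels."""
    if not labels:
        raise ValueError("need at least one fact")
    return sum(credit[label] for label in labels) / len(labels)


def f1(precision, recall):
    """Harmonic mean of precision and recall, 0.0 when both are zero."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# Reference facts judged against the generated summary (drives recall).
ref_labels = ["full", "full", "partial", "none", "full"]
# Generated facts judged against the reference (drives precision).
gen_labels = ["full", "full", "full", "none"]

strict_recall = coverage(ref_labels, CREDIT_STRICT)      # 3 / 5
relaxed_recall = coverage(ref_labels, CREDIT_RELAXED)    # 3.5 / 5
strict_precision = coverage(gen_labels, CREDIT_STRICT)   # 3 / 4
strict_f1 = f1(strict_precision, strict_recall)

# Error attribution: unmatched reference facts are omissions,
# unmatched generated facts are unsupported additions.
omissions = Counter(ref_labels)["none"]
unsupported_additions = Counter(gen_labels)["none"]
```

Running the two directions separately is what makes the scoring bidirectional: one pass asks whether the summary covers the reference, the other whether the reference supports the summary.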
This suggests that systematic evaluation frameworks that run locally, such as TBFact, are feasible for institutional deployment, reducing the need to share sensitive clinical data.
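For the expert-agreement check mentioned for benchmark validation, Cohen's $\kappa$ compares observed agreement $p_o$ between two annotators with the agreement $p_e$ expected by chance, $\kappa = (p_o - p_e) / (1 - p_e)$. A minimal sketch follows; the annotations are made up, and returning 1.0 in the degenerate case is a choice of this sketch, since $\kappa$ is undefined there.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label; kappa is undefined
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary annotations from two experts on six items.
expert_1 = [1, 1, 0, 0, 1, 0]
expert_2 = [1, 1, 0, 1, 1, 0]
kappa = cohens_kappa(expert_1, expert_2)
```

In this toy example the experts agree on five of six items ($p_o = 5/6$) while chance agreement is $p_e = 0.5$, so $\kappa = 2/3$.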
5. Deep-Learning Integration for Molecular Biomarker Prediction in MTBs
Advances in deep learning enable inference of molecular biomarkers from digital pathology (H&E), augmenting MTB capabilities:
- Bottom-up MIL approaches: Whole-slide images (WSI) are tiled, encoded by backbone CNNs, and aggregated with attention pooling (e.g., DeepMIL), yielding slide-level predictions. Emerging transformer-based MIL (TransMIL) approaches further advance performance.
- Pathologist-driven and hybrid models: Incorporate hand-crafted features (nuclear segmentation, tissue types) or combine them with learned embeddings.
- Supervised variants: Utilize co-registered IHC to provide tile-level ground truth.
- Mathematics: Binary cross-entropy (BCE) drives slide-level learning; auxiliary adversarial losses enforce domain invariance; and contrastive self-supervision improves generalization.
- Typical workflow: Tiling, stain normalization, feature extraction, attention-based aggregation, thresholding per biomarker, and heatmap reporting for MTB review.
- Validation: Metrics include AUC, sensitivity, specificity, and F1, computed at the slide/patient level and validated both internally and with external cohorts (Couture, 2022).
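The aggregation and thresholding steps in the list above can be illustrated with a small attention-pooling sketch in pure Python. It assumes tile features have already been produced (tiling, stain normalization, and the CNN encoder are omitted), uses hand-set rather than trained weights, and scores each tile as $w^\top \tanh(V h_k)$; all function names and numbers are illustrative assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tiles, V, w):
    """Weight tile features by attention; return slide vector and weights."""
    scores = [dot(w, [math.tanh(dot(row, h)) for row in V]) for h in tiles]
    alphas = softmax(scores)
    dim = len(tiles[0])
    slide = [sum(a * h[j] for a, h in zip(alphas, tiles)) for j in range(dim)]
    return slide, alphas


def predict(tiles, V, w, clf_w, clf_b, threshold=0.5):
    """Slide-level biomarker probability, thresholded call, and tile weights."""
    slide, alphas = attention_pool(tiles, V, w)
    prob = 1.0 / (1.0 + math.exp(-(dot(clf_w, slide) + clf_b)))
    return prob, prob >= threshold, alphas


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one slide-level prediction."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


# Three made-up 2-D tile embeddings; the second resembles the positive pattern.
tiles = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]
V, w = [[1.0, 1.0]], [2.0]          # hand-set attention parameters
clf_w, clf_b = [2.0, 2.0], -1.5     # hand-set linear classifier
prob, positive, alphas = predict(tiles, V, w, clf_w, clf_b)
loss = bce(prob, 1)
```

The per-tile weights `alphas` are the quantities that would be rendered as a heatmap over the slide for MTB review.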
The integration of such tools enables cost-effective pre-screening, data-driven recommendations for confirmatory molecular testing, and quantification of tumor heterogeneity for MTB consideration.
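The slide-level validation metrics named above can be computed directly from labels and predicted scores. The sketch below uses made-up values and a rank-based AUC (the share of positive-negative pairs ranked correctly, with ties counted as half); the function names are illustrative.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical slide-level labels and model scores.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
y_pred = [int(s >= 0.5) for s in scores]  # 0.5 is an assumed threshold

tp, fp, tn, fn = confusion(y_true, y_pred)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1_score = 2 * precision * sensitivity / (precision + sensitivity)
slide_auc = auc(y_true, scores)
```

AUC is threshold-free, whereas sensitivity, specificity, and F1 depend on the chosen cut-off, which is why per-biomarker thresholds are typically tuned on internal data and then checked on external cohorts.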
6. Impact, Limitations, and Future Directions
Agentic AI systems like HAO and benchmarking approaches such as MTBBench have demonstrated strengths:
- Alignment with real-world MTB team composition and interaction protocols;
- Traceable, citation-backed summary generation reducing information loss and subjective error;
- Potential for substantial reductions in clinician preparation time;
- Rigorous, claim-level evaluation frameworks tolerant of stylistic variation and focused on factual completeness.
Limitations acknowledged include evaluation being restricted to single-specialist agents, reliance on LLM judgments without exhaustive human adjudication, and coverage limited to dataset-verifiable facts. MTB benchmarking demonstrates persistent challenges for current LLMs in multimodal, longitudinal reasoning, with frequent hallucinations and difficulty reconciling conflicting evidence (Codella et al., 8 Sep 2025, Vasilev et al., 25 Nov 2025).
Future work directions include multi-agent, end-to-end performance assessment, attribution of omissions/distortions to specific agents, live or simulated MTB workflow deployments to measure real clinician editing burden and time savings, and metric refinement with category-wise breakdowns for biomarkers, imaging, and therapies. Expansion to automated staging, pathology interpretation, and trial matching workflows is expected.
A plausible implication is that continued advances in agentic orchestration, integration of multimodal deep learning, and rigorous evaluation will further enhance MTB support systems, improving both reliability and scalability in precision oncology.