Submitted-Trial Loop: Iterative Trial Matching
- Submitted-Trial Loop is an iterative workflow for patient–trial matching that redefines eligibility as a dynamic state rather than a binary decision.
- The system ingests heterogeneous EHR data, retrieves candidate trials through multi-step API queries, and evaluates each patient–trial pair using an LLM-enabled template.
- A four-state schema—Eligible Now, Could Be Eligible in Future, Need More Information, and Not Eligible—guides expert review and future re-assessment.
Submitted-Trial Loop denotes an iterative workflow for patient–clinical trial matching in which eligibility is treated as a dynamic, reviewable state rather than a one-shot automated verdict. In the proof-of-concept system that most explicitly articulates the concept, patient data are ingested from the EHR, candidate trials are broadly retrieved, each patient–trial pair is evaluated by a locally deployed reasoning-enabled LLM, the results are packaged for expert review, and cases that are not immediately enrollable are retained as interpretable states such as “Could Be Eligible in Future” or “Need More Information,” together with next-step recommendations. The central significance of this formulation is that it reframes trial matching from static classification into an iterative operational process (Leach et al., 8 Dec 2025).
1. Definition and conceptual basis
The Submitted-Trial Loop arises from a specific critique of conventional clinical trial screening. The underlying paper describes the traditional workflow as manual, slow, and dependent on specialized coordinator expertise: staff search trial registries, often through keyword search in ClinicalTrials.gov, and then manually compare complex inclusion and exclusion criteria against fragmented patient records combining structured fields and unstructured notes. Two bottlenecks are emphasized. First, trial discovery is difficult because registry search depends on carefully chosen terms and lacks semantic similarity support, so equivalent disease descriptions may fail to retrieve the right studies. Second, eligibility adjudication is burdensome because EHR evidence is heterogeneous, incomplete, and clinically nuanced. The resulting delays narrow matching efforts and create missed opportunities for both patients and trial recruitment (Leach et al., 8 Dec 2025).
Within that setting, the Submitted-Trial Loop is the paper’s operational answer to the observation that many patients are neither clearly eligible nor permanently excluded at first pass. Instead of collapsing each patient–trial pair into a binary label, the workflow preserves contingent opportunities, surfaces missing evidence, and supports later reconsideration. The loop is therefore not merely a conceptual reformulation of eligibility; it is an operational design in which screening, expert review, follow-up action, status update, and rematching become repeated cycles (Leach et al., 8 Dec 2025).
2. Pipeline architecture
The architecture is described as a four-step pipeline. First, heterogeneous EHR data are ingested and processed by a Patient Information Extraction template using DeepSeek, a reasoning-enabled open-source LLM, with prompts rendered through Jinja2. Initial experiments used synthetic patient records deliberately constructed to resemble real EHR variability, including missing fields, free text, tables, and PDF-style layouts, especially for genomic reports. A dedicated parsing module acts as preprocessing to normalize these varying formats before LLM extraction. The output is a structured patient report spanning 14 clinical categories: Primary Diagnosis, Base Diagnosis, Diagnosis Synonyms, Patient Demographics, Current Interventions, Treatment History, Search Terms, Biomarkers / Molecular Profile, Performance Status, Laboratory Values, Comorbidities, Family History, Treatment Goals, and Eligibility Factors (Leach et al., 8 Dec 2025).
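As a rough illustration of how the 14-category report could be checked for gaps after extraction, the sketch below lists the categories named above and reports which ones are absent or empty. The function name `missing_categories`, the dictionary representation, and the example report are assumptions made for illustration; the paper does not specify how its structured report is stored or validated.

```python
from typing import Dict, List

# The 14 report categories named in the text.
REPORT_CATEGORIES: List[str] = [
    "Primary Diagnosis", "Base Diagnosis", "Diagnosis Synonyms",
    "Patient Demographics", "Current Interventions", "Treatment History",
    "Search Terms", "Biomarkers / Molecular Profile", "Performance Status",
    "Laboratory Values", "Comorbidities", "Family History",
    "Treatment Goals", "Eligibility Factors",
]


def missing_categories(report: Dict[str, object]) -> List[str]:
    """Return the categories that are absent or empty in an extracted report."""
    return [name for name in REPORT_CATEGORIES if not report.get(name)]


# Hypothetical, partially filled report used only for illustration.
example_report = {
    "Primary Diagnosis": "pancreatic adenocarcinoma",
    "Current Interventions": ["FOLFIRINOX"],
}
gaps = missing_categories(example_report)
```

A check of this kind would let a pipeline flag sparse records before retrieval, which fits the text's emphasis on missing fields in realistic EHR data.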
Second, the structured patient report drives candidate retrieval from the ClinicalTrials.gov REST API. The retrieval strategy is explicitly multi-pass and recall-oriented. The system does not rely on a single narrow diagnosis string; it first queries the exact primary diagnosis, then broader diagnosis terminology and diagnosis synonyms, while constraining by recruiting status, age, sex, and default geography within the United States. Results from multiple API calls are consolidated, deduplicated, and stored trial-by-trial as structured JSON objects containing metadata such as NCT_ID and Trial_Title together with protocol fields including description, inclusion criteria, and exclusion criteria. In the proof-of-concept run on 30 synthetic EHRs, candidate retrieval averaged 950 trials per patient, which the authors describe as roughly a tenfold increase after synonym-based expansion (Leach et al., 8 Dec 2025).
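The multi-pass, consolidate-and-deduplicate strategy described above might look roughly like the following. This is a minimal sketch, not the authors' implementation: the registry call is injected as a plain function so the example runs offline, `retrieve_candidates` and `fake_search` are hypothetical names, the trial identifiers are placeholders, and the filters on recruiting status, age, sex, and geography are omitted.

```python
from typing import Callable, Dict, Iterable, List

Trial = Dict[str, str]


def retrieve_candidates(
    query_terms: Iterable[str],
    search: Callable[[str], List[Trial]],
) -> List[Trial]:
    """Run one search per term and merge results, keeping the first copy of each NCT_ID."""
    merged: Dict[str, Trial] = {}
    for term in query_terms:
        for trial in search(term):
            # setdefault keeps the earliest hit, so duplicates from later passes are dropped.
            merged.setdefault(trial["NCT_ID"], trial)
    return list(merged.values())


def fake_search(term: str) -> List[Trial]:
    """Stand-in for a registry query; a real version would call the ClinicalTrials.gov API."""
    index = {
        "exact diagnosis": [{"NCT_ID": "NCT-A", "Trial_Title": "Trial A"}],
        "broader diagnosis": [
            {"NCT_ID": "NCT-A", "Trial_Title": "Trial A"},
            {"NCT_ID": "NCT-B", "Trial_Title": "Trial B"},
        ],
        "synonym": [{"NCT_ID": "NCT-C", "Trial_Title": "Trial C"}],
    }
    return index.get(term, [])


# Passes ordered from narrowest to broadest, as in the text.
candidates = retrieve_candidates(
    ["exact diagnosis", "broader diagnosis", "synonym"], fake_search
)
```

Ordering the passes from narrow to broad means that, when a trial is returned more than once, the copy retained is the one found by the most specific query.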
Third, each patient is evaluated against each candidate trial by a second templated LLM prompt, the Patient-Trial Eligibility Evaluator. This template operationalizes review in three sections: minimum eligibility criteria, inclusion criteria, and exclusion criteria. The model is instructed to confirm demographic prerequisites, compare the broader clinical profile against inclusion rules, assess exclusion conditions separately, and produce supporting reasoning, certainty based on data completeness and clarity, and explicit identification of information gaps. Fourth, the system transforms the raw JSON output into Word and PDF reports organized in a hierarchical documentation style. Each report includes metadata such as assessment date, trial_ID, patient_ID, and assessor_information, meaning which model and which template version were used. This structure is presented as essential for auditability and workflow interpretability (Leach et al., 8 Dec 2025).
3. Dynamic eligibility states and resubmission logic
The mechanism that makes the workflow a loop is its four-state output schema. Rather than reducing a patient–trial pair to an irreversible yes/no determination, the system aggregates criterion-level reasoning into one of four overall states (Leach et al., 8 Dec 2025).
| State | Operational meaning |
|---|---|
| “Eligible Now” | Key criteria are met |
| “Could Be Eligible in Future” | Current deficits are temporal or remediable |
| “Not Eligible” | Key criteria fail |
| “Need More Information” | Decisive data are absent or ambiguous |
“Could Be Eligible in Future” captures cases where protocol prerequisites are not yet met but may become satisfied with time, treatment completion, additional tests, or clarification of uncertain findings. “Need More Information” captures unresolved cases where missing evidence prevents a confident conclusion. These states preserve candidate trials that would otherwise be lost in a reject-only workflow and convert trial matching into a resubmission process (Leach et al., 8 Dec 2025).
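One way to picture the four-state schema is as an enumeration plus a rule that rolls criterion-level findings up into an overall state. In the system described here the LLM produces the overall state; the deterministic precedence rule below, the criterion labels, and the name `aggregate` are assumptions chosen only to make the state semantics concrete.

```python
from enum import Enum
from typing import Iterable


class Status(Enum):
    ELIGIBLE_NOW = "Eligible Now"
    FUTURE = "Could Be Eligible in Future"
    NEED_INFO = "Need More Information"
    NOT_ELIGIBLE = "Not Eligible"


# Hypothetical criterion-level labels (not taken from the paper).
MET = "met"
UNMET_PERMANENT = "unmet_permanent"
UNMET_REMEDIABLE = "unmet_remediable"
UNKNOWN = "unknown"


def aggregate(criteria: Iterable[str]) -> Status:
    """Roll criterion-level labels up into one overall state (assumed precedence)."""
    labels = list(criteria)
    if not labels:
        return Status.NEED_INFO       # nothing assessed, so nothing can be concluded
    if UNMET_PERMANENT in labels:
        return Status.NOT_ELIGIBLE    # a key criterion fails outright
    if UNMET_REMEDIABLE in labels:
        return Status.FUTURE          # deficit may resolve with time or treatment
    if UNKNOWN in labels:
        return Status.NEED_INFO       # decisive data absent or ambiguous
    return Status.ELIGIBLE_NOW
```

Placing the remediable case ahead of the unknown case mirrors the paper's pancreatic example, in which a case with both a time-dependent deficit and open questions was reported as "Could Be Eligible in Future".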
The paper’s pancreatic adenocarcinoma example illustrates the logic. A synthetic patient was assessed against trial NCT05764720. The model recognized that the patient had completed only one two-week cycle of FOLFIRINOX, while the trial required at least two months of chemotherapy. Rather than assign “Not Eligible,” the system returned “Could Be Eligible in Future” with medium confidence. It also identified unresolved items including imaging confirmation, the ability to interrupt systemic therapy, and breath-hold capability, and it generated corresponding follow-up recommendations: complete chemotherapy, verify therapy interruption feasibility, obtain imaging, and test breath-hold capacity. In loop form, such outputs become triggers for future rematching after the patient’s status changes (Leach et al., 8 Dec 2025).
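The rematching trigger can be sketched as a small bookkeeping step: each retained case records what it is waiting on, and a case is resubmitted once any of those items arrives. The `PendingCase` structure, the item names, and the patient identifier are hypothetical; only the trial ID, the status, and the four follow-up themes come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PendingCase:
    patient_id: str
    trial_id: str
    status: str
    awaiting: Set[str] = field(default_factory=set)  # open follow-up items


def cases_to_resubmit(
    pending: List[PendingCase], new_evidence: Set[str]
) -> List[PendingCase]:
    """Select pending cases for which at least one awaited item has now arrived."""
    return [case for case in pending if case.awaiting & new_evidence]


# Hypothetical record mirroring the pancreatic adenocarcinoma example.
pending = [
    PendingCase(
        patient_id="patient-001",  # placeholder identifier
        trial_id="NCT05764720",
        status="Could Be Eligible in Future",
        awaiting={
            "chemotherapy_duration",
            "therapy_interruption",
            "imaging",
            "breath_hold_test",
        },
    )
]

ready = cases_to_resubmit(pending, {"imaging"})
```

In this sketch a single new item is enough to trigger re-evaluation; a stricter policy could wait until every awaited item is present.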
4. Interpretability, human review, and deployment constraints
Interpretability is implemented at both model and output levels. DeepSeek is described as an open-source reasoning-capable model with a 128,000-token context window that emits its intermediate reasoning steps inside <think> tags. In this system, that capability is used to articulate how specific patient characteristics support or violate specific trial criteria. The structured JSON schema shown in the paper includes fields such as "eligibility_summary", "eligibility_status", "confidence_level", "primary_criteria_assessment", "clinical_criteria_assessment", "exclusion_criteria_assessment", "actionable_recommendations", and "missing_data_points". Criterion-level sub-assessments include a name or criterion, a status, and a reasoning string, allowing reviewers to inspect the logic beyond the final verdict (Leach et al., 8 Dec 2025).
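A reviewer-facing pipeline would plausibly check each model response for the fields listed above before generating a report. The sketch below does that with the standard library. The field names are those quoted from the paper; the value types, the placeholder contents, and the function name `missing_fields` are assumptions.

```python
import json
from typing import List

# Top-level fields quoted in the text.
REQUIRED_FIELDS: List[str] = [
    "eligibility_summary",
    "eligibility_status",
    "confidence_level",
    "primary_criteria_assessment",
    "clinical_criteria_assessment",
    "exclusion_criteria_assessment",
    "actionable_recommendations",
    "missing_data_points",
]


def missing_fields(raw_json: str) -> List[str]:
    """Return the required top-level fields that are absent from a model response."""
    record = json.loads(raw_json)
    return [name for name in REQUIRED_FIELDS if name not in record]


# Deliberately incomplete response with placeholder values (illustrative only).
example = json.dumps(
    {
        "eligibility_summary": "placeholder summary",
        "eligibility_status": "Could Be Eligible in Future",
        "confidence_level": "medium",
        "primary_criteria_assessment": [
            {
                "criterion": "placeholder",
                "status": "placeholder",
                "reasoning": "placeholder",
            }
        ],
    }
)
absent = missing_fields(example)
```

Responses with absent fields could be retried or routed to manual review instead of being rendered into a report.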
Human-in-the-loop review is the intended operating mode, not a fallback. The system narrows and structures the matching problem, but clinicians and coordinators remain responsible for validating matches, resolving ambiguities, and deciding whether to pursue a study. High-certainty matches can be verified efficiently, while “Need More Information” and “Could Be Eligible in Future” cases direct expert attention to missing tests, unresolved protocol nuances, or opportunities for later reevaluation. Subject matter experts were involved in designing both templates, including the extracted variables and the interpretation of trial criteria. Future work identified in the paper includes explicit source citation down to report date and page number, a web-based interface for expert review, and systematic capture of reviewer feedback that could eventually be used for reinforcement learning (Leach et al., 8 Dec 2025).
Security, privacy, and auditability are treated as core deployment constraints. The paper explicitly rejects dependence on externally hosted commercial LLMs for PHI-bearing workflows because HIPAA compliance cannot be guaranteed in that setting. Instead, the system uses open-source models deployed locally on institutionally approved infrastructure within a NIST-compliant data center, with strict data-use controls and containerized deployment. Auditability is supported through structured outputs, report generation, and metadata capture, and the paper emphasizes “comprehensive auditability of all AI-generated outputs” as a design goal (Leach et al., 8 Dec 2025).
5. Empirical behavior, scalability, and limitations
The evaluation is a proof-of-concept on synthetic EHRs rather than a clinical validation study. Across 30 synthetic patient records, the system produced 28,575 patient–trial evaluations. Of these, 25,192, or 88%, were labeled “Not Eligible,” which the authors interpret as clinically plausible because most trials do not fit most patients and trial availability is uneven across disease types. Sixteen of the 30 patient personas received at least one “Eligible Now” trial. The paper also gives two scale examples: a synthetic Hodgkin lymphoma patient was evaluated against over 5,000 candidate trials and yielded eight matches, whereas a synthetic patient with adenoid cystic carcinoma of the salivary gland was reviewed against nearly 800 trials and yielded one match (Leach et al., 8 Dec 2025).
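The headline figures can be cross-checked with a few lines of arithmetic. The three input numbers are taken from the text; the derived quantities are not reported in this form and are shown only as a consistency check. The per-patient mean assumes that every retrieved candidate was evaluated exactly once.

```python
total_evaluations = 28_575   # patient-trial evaluations reported
not_eligible = 25_192        # evaluations labeled "Not Eligible"
patients = 30                # synthetic patient records

# Share labeled "Not Eligible": about 0.88, matching the reported 88%.
share_not_eligible = not_eligible / total_evaluations

# Evaluations left across the other three states combined (derived, not reported).
remaining = total_evaluations - not_eligible

# Mean evaluations per patient, consistent with the reported average of about 950 trials.
mean_per_patient = total_evaluations / patients
```

The mean of 952.5 evaluations per patient agrees with the retrieval average quoted earlier, which suggests that the evaluation count is simply the total number of retrieved patient–trial pairs.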
Scalability is supported by modular orchestration and multithreaded execution. Intermediate and final structured data are represented in JSON, post-processing and report generation are handled with Python scripts, and the main orchestration is described as an overarching run_pipeline script. The authors justify multithreading on the grounds that LLM pipelines are inference-latency bound, so concurrent completions improve throughput. The architecture is also intentionally modular: the extraction template can be extended with new data elements, retrieval can search the full registry or evaluate a pre-identified trial list, and the eligibility template can be adjusted for stricter or more permissive reasoning modes (Leach et al., 8 Dec 2025).
The limitations are equally explicit. The system has only been evaluated on synthetic data. Source-level traceability is incomplete because extracted facts are not yet explicitly linked back to precise source locations. There is not yet a dedicated web interface for natural expert review and correction. Retrieval still depends on handcrafted search-expansion logic because the underlying registry lacks semantic search. Broad retrieval also increases downstream computational burden, and any LLM-based system may misread clinical nuance, especially in ambiguous or incomplete records. For those reasons, the paper insists on human review and conservative recommendation categories (Leach et al., 8 Dec 2025).
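The multithreaded execution mentioned in the scalability discussion can be sketched with Python's standard thread pool. The paper does not publish its orchestration code, so the function names, the worker count, and the stub evaluator below are assumptions; a real evaluator would send a rendered prompt to the locally hosted model and parse the JSON response.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (patient_id, trial_id)


def evaluate_all(
    pairs: List[Pair],
    evaluate: Callable[[str, str], Dict[str, str]],
    max_workers: int = 8,
) -> List[Dict[str, str]]:
    """Evaluate patient-trial pairs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # even when individual calls finish out of order.
        return list(pool.map(lambda pair: evaluate(*pair), pairs))


def fake_evaluate(patient_id: str, trial_id: str) -> Dict[str, str]:
    """Stub standing in for an LLM call; returns a fixed placeholder status."""
    return {
        "patient_ID": patient_id,
        "trial_ID": trial_id,
        "eligibility_status": "Need More Information",
    }


results = evaluate_all([("p1", "t1"), ("p1", "t2")], fake_evaluate, max_workers=2)
```

Threads suit this workload because each call spends most of its time waiting on model inference, so several requests can be in flight at once without competing for the interpreter.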
6. Related loop formulations in adjacent trial informatics
Although the Submitted-Trial Loop is most explicitly articulated in AI-assisted patient–trial matching, related loop structures appear elsewhere in the clinical-trial informatics literature. “CliniDigest” maps naturally onto a submitted-trial monitoring loop in which newly registered ClinicalTrials.gov studies are repeatedly batched, recursively summarized, and surfaced for ongoing review; its reported implementation is a batch summarization method rather than a specified streaming architecture (White et al., 2023). This suggests a complementary loop focused on surveillance of incoming trials rather than matching patients to them.
A different but related layer is evidence verification. The NLI4CT shared task formalizes claim adjudication over clinical trial reports as linked tasks of evidence selection and entailment, requiring multi-evidence biomedical and numerical reasoning. This suggests that a Submitted-Trial Loop can include not only retrieval and matching, but also fact-level verification of trial claims and supporting evidence (Jullien et al., 2023).
Loop-based logic also appears in trial protocol control. “Dose-Escalation Trial Protocols that Extend Naturally to Admit Titration” models dose assignment as a monotone map from observed trial states to doses and uses a right Kan extension to convert rigid cohort rules into a rolling-enrollment workflow with pending outcomes and within-patient titration (Norris, 2 Jul 2025). In another direction, “ClinicalReTrial” frames protocol redesign itself as a closed-loop, reward-driven optimization problem in which failed protocols are iteratively diagnosed, modified, safety-filtered, evaluated in a learned simulation environment, and resubmitted within a bounded search budget (Xing et al., 1 Jan 2026). These adjacent formulations do not use the Submitted-Trial Loop label in the same way, but they show that loop-based trial workflows can govern monitoring, evidence adjudication, enrollment control, and protocol redesign as well as patient matching.
Taken together, these related systems suggest that the Submitted-Trial Loop is best understood not as a single software artifact but as an operational pattern: a trial-related object is submitted in a provisional state, evaluated against structured criteria, returned with interpretable status and actionable feedback, and then reconsidered after new evidence, new protocol state, or new data become available. In the primary patient-matching formulation, that object is the patient–trial pair itself, and the loop is driven by dynamic eligibility states rather than binary screening outcomes (Leach et al., 8 Dec 2025).