Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
91 tokens/sec
Gemini 2.5 Pro Premium
52 tokens/sec
GPT-5 Medium
24 tokens/sec
GPT-5 High Premium
28 tokens/sec
GPT-4o
85 tokens/sec
DeepSeek R1 via Azure Premium
87 tokens/sec
GPT OSS 120B via Groq Premium
478 tokens/sec
Kimi K2 via Groq Premium
221 tokens/sec
2000 character limit reached

Cohort Retrieval using Dense Passage Retrieval (2507.01049v1)

Published 26 Jun 2025 in cs.IR and cs.CL

Abstract: Patient cohort retrieval is a pivotal task in medical research and clinical practice, enabling the identification of specific patient groups from extensive electronic health records (EHRs). In this work, we address the challenge of cohort retrieval in the echocardiography domain by applying Dense Passage Retrieval (DPR), a prominent methodology in semantic search. We propose a systematic approach to transform an echocardiographic EHR dataset of unstructured nature into a Query-Passage dataset, framing the problem as a Cohort Retrieval task. Additionally, we design and implement evaluation metrics inspired by real-world clinical scenarios to rigorously test the models across diverse retrieval tasks. Furthermore, we present a custom-trained DPR embedding model that demonstrates superior performance compared to traditional and off-the-shelf SOTA methods.To our knowledge, this is the first work to apply DPR for patient cohort retrieval in the echocardiography domain, establishing a framework that can be adapted to other medical domains.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Authors (1)