KU-DMIS at EHRSQL 2024: Generating SQL query via question templatization in EHR (2406.00014v2)
Abstract: Transforming natural language questions into SQL queries is crucial for precise data retrieval from electronic health record (EHR) databases. A significant challenge in this process is detecting and rejecting unanswerable questions that request information beyond the database's scope or exceed the system's capabilities. In this paper, we introduce a novel text-to-SQL framework that robustly handles out-of-domain questions and verifies the generated queries through query execution. Our framework begins by standardizing the structure of questions into a templated format. We use a fine-tuned GPT-3.5 model with detailed prompts that include the table schemas of the EHR database system. Our experimental results demonstrate the effectiveness of our framework on the EHRSQL-2024 benchmark, a shared task in the ClinicalNLP workshop. Although straightforward fine-tuning of GPT showed promising results on the development set, it struggled with the out-of-domain questions in the test set. With our framework, we improve our system's adaptability and achieve competitive performance on the official leaderboard of the EHRSQL-2024 challenge.
- Hajung Kim
- Chanhwi Kim
- Hoonick Lee
- Kyochul Jang
- Jiwoo Lee
- Kyungjae Lee
- Gangwoo Kim
- Jaewoo Kang
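
The abstract mentions verifying generated queries with query execution but gives no detail on how. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes a SQLite database and assumes that a query which fails to execute is treated as unanswerable. The function name `verify_by_execution`, the `patients` table, and the choice of returning `None` to signal abstention are all hypothetical.

```python
import sqlite3
from typing import Optional


def verify_by_execution(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return the query if it executes without error, otherwise None.

    Assumption: a generated query that raises a database error (for
    example, because it references a column outside the schema) is
    treated as a sign that the question is unanswerable.
    """
    try:
        conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None  # abstain: the query could not be executed
    return sql


# Hypothetical toy schema standing in for an EHR database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (subject_id INTEGER, gender TEXT)")
conn.execute("INSERT INTO patients VALUES (1, 'f'), (2, 'm')")

answerable = verify_by_execution(conn, "SELECT COUNT(*) FROM patients")
rejected = verify_by_execution(conn, "SELECT blood_type FROM patients")
```

Here `answerable` keeps the original query string, while `rejected` is `None` because the `blood_type` column does not exist in the toy schema. A real system would likely combine such a check with other signals, since a query can execute cleanly and still answer the wrong question.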