
RadGraph: Extracting Clinical Entities and Relations from Radiology Reports (2106.14463v3)

Published 28 Jun 2021 in cs.CL, cs.AI, cs.IR, and cs.LG

Abstract: Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves a micro F1 of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations) with mappings to associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as computer vision and multi-modal learning when linked to chest radiographs.

Authors (12)
  1. Saahil Jain (6 papers)
  2. Ashwin Agrawal (8 papers)
  3. Adriel Saporta (5 papers)
  4. Steven QH Truong (4 papers)
  5. Du Nguyen Duong (1 paper)
  6. Tan Bui (4 papers)
  7. Pierre Chambon (7 papers)
  8. Yuhao Zhang (107 papers)
  9. Matthew P. Lungren (43 papers)
  10. Andrew Y. Ng (55 papers)
  11. Curtis P. Langlotz (23 papers)
  12. Pranav Rajpurkar (69 papers)
Citations (171)

Summary

  • The paper introduces a novel dataset and extraction schema that maps clinical entities and relations in radiology reports.
  • It employs a refined schema with four entity types and three relation categories, validated by board-certified radiologists; a deep learning benchmark trained on the annotations achieves micro F1 scores of 0.82 (MIMIC-CXR) and 0.73 (CheXpert) on relation extraction.
  • Its structured annotations promise significant impact on AI-driven diagnostic systems, clinical documentation, and multi-modal learning in healthcare.

Extracting Clinical Entities and Relations from Radiology Reports with RadGraph

The paper "RadGraph: Extracting Clinical Entities and Relations from Radiology Reports" introduces a novel dataset and information extraction schema aimed at structuring information from radiology reports, specifically from chest X-rays. The authors present RadGraph as a comprehensive dataset that not only includes detailed annotations of clinical entities and their interrelations but also provides a benchmark for evaluating information extraction performance in the medical domain. The paper addresses the challenge of extracting meaningful data from the free-text format of radiology reports, which is pivotal for various applications in healthcare, such as enhancing medical imaging models and facilitating disease surveillance.

Innovations in Data and Schema

The authors develop RadGraph to overcome specific limitations in previous datasets and schemas. The RadGraph schema defines four types of entities: Anatomy, Observation: Definitely Present, Observation: Uncertain, and Observation: Definitely Absent. Relations among these entities are categorized into three types: Suggestive Of, Located At, and Modify, delivering a balance of simplicity and clinically relevant information coverage. This schema is refined through feedback from board-certified radiologists, ensuring it is both clinically meaningful and annotation-efficient.
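The schema above can be sketched as a small set of data structures. The following is a minimal illustration, not the paper's actual annotation format: the class names, fields, and the example sentence are assumptions made for clarity; only the four entity types and three relation types come from the RadGraph schema.

```python
from dataclasses import dataclass
from enum import Enum

# The four entity types defined by the RadGraph schema.
class EntityType(Enum):
    ANATOMY = "Anatomy"
    OBS_PRESENT = "Observation: Definitely Present"
    OBS_UNCERTAIN = "Observation: Uncertain"
    OBS_ABSENT = "Observation: Definitely Absent"

# The three relation types defined by the RadGraph schema.
class RelationType(Enum):
    SUGGESTIVE_OF = "Suggestive Of"
    LOCATED_AT = "Located At"
    MODIFY = "Modify"

@dataclass(frozen=True)
class Entity:
    tokens: str          # surface text span from the report
    etype: EntityType

@dataclass(frozen=True)
class Relation:
    source: Entity
    target: Entity
    rtype: RelationType

# Hypothetical annotation of the sentence "Mild opacity in the left lung."
mild = Entity("mild", EntityType.OBS_PRESENT)
opacity = Entity("opacity", EntityType.OBS_PRESENT)
lung = Entity("left lung", EntityType.ANATOMY)

relations = [
    Relation(mild, opacity, RelationType.MODIFY),      # "mild" modifies "opacity"
    Relation(opacity, lung, RelationType.LOCATED_AT),  # the opacity is located at the lung
]
```

Representing a report as such entity-relation pairs is what turns free text into a graph that downstream models can consume.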

The dataset itself is meticulously annotated through a collaboration between Stanford University, VinBrain, and other institutions. It includes a development dataset derived from the MIMIC-CXR dataset, test datasets from both the MIMIC-CXR and CheXpert datasets, and a large inference dataset comprising radiology reports with automatic annotations produced by the RadGraph Benchmark model.

Performance and Benchmarking

The RadGraph Benchmark model, built on modern deep learning techniques, achieves micro F1 scores of 0.82 and 0.73 for relation extraction on the MIMIC-CXR and CheXpert test sets, respectively. These results affirm the model's capability to extract clinically significant relations and demonstrate its potential applicability across institutional datasets. Additionally, inter-radiologist agreement measurements validate the robustness of the schema and annotations, with Cohen's Kappa indicating high agreement, particularly on the MIMIC-CXR subset.
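A micro-averaged F1 pools true positives, false positives, and false negatives over all instances before computing precision and recall. The sketch below illustrates the metric on relation triples; treating a relation as an exact (source, type, target) match is an assumption made for this example, not necessarily the paper's exact scoring procedure.

```python
def micro_f1(gold: set, pred: set) -> float:
    """Micro-averaged F1 over exact-match items pooled across all examples."""
    tp = len(gold & pred)   # predicted and in the gold standard
    fp = len(pred - gold)   # predicted but not in the gold standard
    fn = len(gold - pred)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# Hypothetical gold vs. predicted relation triples for one report.
gold = {("opacity", "Located At", "left lung"), ("mild", "Modify", "opacity")}
pred = {("opacity", "Located At", "left lung"), ("mild", "Modify", "lung")}

print(micro_f1(gold, pred))  # 1 TP, 1 FP, 1 FN -> precision = recall = F1 = 0.5
```

Because the counts are pooled globally rather than averaged per class, frequent relation types dominate the score, which is the standard convention for this kind of extraction benchmark.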

Implications and Future Directions

RadGraph holds substantial promise for advancing NLP applications in the clinical domain, particularly within radiology. The development of such a dataset can propel future research in multi-modal learning, linking structured report data with imaging. This integration promises enhancements in AI-driven diagnostic systems and the automation of clinical documentation processes.

Moreover, the release of the RadGraph dataset under a flexible license supports wide-ranging academic use, potentially catalyzing innovation in medical NLP and computer vision tasks. As AI systems integrate more seamlessly into healthcare, datasets like RadGraph will be instrumental in refining model accuracy and expanding their operational scope.

Ethical Considerations and Limitations

The incorporation of ethical considerations—such as de-identifying patient records following HIPAA standards and acknowledging potential demographic biases—is notable. These practices underscore the researchers' awareness of privacy and equity in AI development. However, limitations persist, notably the schema's focus on chest X-rays, potentially limiting generalizability across diverse radiological contexts and patient demographics. Moreover, future iterations may explore augmenting the schema to incorporate richer contextual elements present in the clinical narrative.

In conclusion, RadGraph significantly contributes to structuring radiological text data and sets a robust foundation for subsequent developments in automated information extraction and multi-modal learning in healthcare. This work represents a crucial step in bridging the gap between unstructured clinical notes and structured data amenable to AI analysis, facilitating more effective and scalable healthcare solutions.
