
A Novel Corpus of Annotated Medical Imaging Reports and Information Extraction Results Using BERT-based Language Models (2403.18975v1)

Published 27 Mar 2024 in cs.CL

Abstract: Medical imaging is critical to the diagnosis, surveillance, and treatment of many health conditions, including oncological, neurological, cardiovascular, and musculoskeletal disorders. Radiologists interpret these complex, unstructured images and articulate their assessments through narrative reports that remain largely unstructured. This narrative must be converted into a structured semantic representation to facilitate secondary applications such as retrospective analyses or clinical decision support. Here, we introduce the Corpus of Annotated Medical Imaging Reports (CAMIR), which includes 609 annotated radiology reports from three imaging modality types: Computed Tomography, Magnetic Resonance Imaging, and Positron Emission Tomography-Computed Tomography. Reports were annotated using an event-based schema that captures clinical indications, lesions, and medical problems. Each event consists of a trigger and multiple arguments, and a majority of the argument types, including anatomy, normalize the spans to pre-defined concepts to facilitate secondary use. CAMIR uniquely combines a granular event structure and concept normalization. To extract CAMIR events, we explored two BERT (Bidirectional Encoder Representations from Transformers)-based architectures: an existing architecture (mSpERT) that jointly extracts all event information, and a multi-step approach (PL-Marker++) that we augmented for the CAMIR schema.
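
The event structure described in the abstract (a trigger plus multiple arguments, some of which are normalized to pre-defined concepts) can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the CAMIR annotation format: the class names (`Span`, `Argument`, `Event`), the type labels (`"Lesion"`, `"Anatomy"`), the concept label `"Lung"`, and the sample sentence are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    """A character range in a report (hypothetical representation)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset

    def text(self, report: str) -> str:
        return report[self.start:self.end]


@dataclass
class Argument:
    arg_type: str                  # illustrative label, e.g. "Anatomy"
    span: Span
    concept: Optional[str] = None  # normalized concept, if this argument type has one


@dataclass
class Event:
    event_type: str                # illustrative label, e.g. "Lesion"
    trigger: Span
    arguments: List[Argument] = field(default_factory=list)

    def normalized_concepts(self) -> List[str]:
        """Collect the concept labels of all normalized arguments."""
        return [a.concept for a in self.arguments if a.concept is not None]


# Made-up sentence and labels, used only to exercise the classes.
report = "There is a 2 cm mass in the right upper lobe."
mass = report.index("mass")
lobe = report.index("right upper lobe")

event = Event(
    event_type="Lesion",
    trigger=Span(mass, mass + len("mass")),
    arguments=[
        Argument(
            arg_type="Anatomy",
            span=Span(lobe, lobe + len("right upper lobe")),
            concept="Lung",  # hypothetical normalized concept
        )
    ],
)

print(event.trigger.text(report), event.normalized_concepts())
```

Keeping the span and the normalized concept side by side on each argument means the original wording stays recoverable from the report, while the concept label gives secondary applications a fixed vocabulary to query.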

Authors (8)
  1. Namu Park (3 papers)
  2. Kevin Lybarger (19 papers)
  3. Giridhar Kaushik Ramachandran (8 papers)
  4. Spencer Lewis (1 paper)
  5. Aashka Damani (2 papers)
  6. Martin Gunn (3 papers)
  7. Meliha Yetisgen (31 papers)
  8. Ozlem Uzuner (26 papers)
