Enhancing Biomedical Text Summarization and Question-Answering: On the Utility of Domain-Specific Pre-Training (2307.04412v1)

Published 10 Jul 2023 in cs.CL

Abstract: Biomedical summarization requires large datasets to train models for text generation. We show that while transfer learning offers a viable option for addressing this challenge, in-domain pre-training does not always offer advantages in a BioASQ summarization task. We identify a suitable model architecture and use it to show the benefit of general-domain pre-training followed by task-specific fine-tuning in the context of a BioASQ summarization task, leading to a novel three-step fine-tuning approach that works with only a thousand in-domain examples. Our results indicate that a large language model without domain-specific pre-training can have a significant edge in some domain-specific biomedical text generation tasks.
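
To make the staged approach concrete, the sketch below shows how such a pipeline could be set up with a general-domain BART checkpoint in Hugging Face transformers: the already pre-trained model is fine-tuned first on a larger general summarization corpus and then on roughly a thousand BioASQ-style question-to-ideal-answer pairs. The model name, data loading, and hyperparameters are illustrative assumptions for this sketch, not the paper's exact configuration.

```python
# Hedged sketch of three-stage fine-tuning: stage 1 is the general-domain
# pre-training already contained in the checkpoint; stages 2 and 3 are
# successive fine-tuning passes run below. Model name, example data, and
# hyperparameters are assumptions, not the paper's reported setup.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import BartForConditionalGeneration, BartTokenizerFast

MODEL_NAME = "facebook/bart-base"  # general-domain checkpoint (assumed)


class SummDataset(Dataset):
    """Pairs of (source document, reference summary) tokenized for seq2seq."""

    def __init__(self, pairs, tokenizer, max_src=512, max_tgt=128):
        self.pairs, self.tok = pairs, tokenizer
        self.max_src, self.max_tgt = max_src, max_tgt

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, i):
        src, tgt = self.pairs[i]
        enc = self.tok(src, max_length=self.max_src, truncation=True,
                       padding="max_length", return_tensors="pt")
        lab = self.tok(tgt, max_length=self.max_tgt, truncation=True,
                       padding="max_length", return_tensors="pt")
        labels = lab["input_ids"].squeeze(0)
        labels[labels == self.tok.pad_token_id] = -100  # ignore padding in loss
        return {"input_ids": enc["input_ids"].squeeze(0),
                "attention_mask": enc["attention_mask"].squeeze(0),
                "labels": labels}


def fine_tune(model, tokenizer, pairs, epochs=3, lr=3e-5, batch_size=4):
    """One fine-tuning stage: standard cross-entropy seq2seq training."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).train()
    loader = DataLoader(SummDataset(pairs, tokenizer),
                        batch_size=batch_size, shuffle=True)
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            loss = model(**batch).loss
            loss.backward()
            optim.step()
            optim.zero_grad()
    return model


if __name__ == "__main__":
    tok = BartTokenizerFast.from_pretrained(MODEL_NAME)
    model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)
    # Toy placeholders; real data loading for the general summarization corpus
    # and the ~1,000 BioASQ-style examples would go here.
    general_pairs = [("a long general-domain article ...", "its short summary ...")]
    bioasq_pairs = [("biomedical question with snippets ...", "ideal answer ...")]
    model = fine_tune(model, tok, general_pairs)  # stage 2: task-specific tuning
    model = fine_tune(model, tok, bioasq_pairs)   # stage 3: in-domain tuning
```

The key design point illustrated here is that no biomedical pre-training is assumed: the domain signal enters only in the final, small fine-tuning stage.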
