GPT-3 Powered Information Extraction for Building Robust Knowledge Bases (2408.04641v1)

Published 31 Jul 2024 in cs.CL, cs.AI, and cs.LG

Abstract: This work uses the LLM GPT-3 to propose a novel information extraction method for knowledge base construction. The method addresses the difficulty of identifying relevant entities and relationships in unstructured text in order to produce structured information. We evaluate it on a large corpus of text drawn from diverse domains, using the metrics standard for information extraction tasks: precision, recall, and F1-score. The results demonstrate that GPT-3 can extract relevant and correct information from text efficiently and accurately, improving both the precision and the productivity of knowledge base creation. We also compare the proposed approach against state-of-the-art information extraction techniques and find that, using only a small number of examples for in-context learning, it achieves competitive results with substantial savings in data annotation and engineering effort. We further apply the method to biomedical information extraction, demonstrating its practicality in a real-world setting. Overall, the proposed method offers a viable way to obtain structured data from unstructured text for knowledge base construction, and can substantially improve the precision and efficiency of information extraction, which underpins applications such as chatbots, recommendation engines, and question-answering systems.
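
As a rough illustration of the few-shot in-context learning setup the abstract describes, the sketch below builds a prompt from a handful of annotated examples, asks a GPT-3-style completion endpoint for (entity, relation, entity) triples, and scores predictions with the precision/recall/F1 metrics named above. The example annotations, prompt format, model name, and helper functions are assumptions for illustration, not the authors' actual implementation; the openai.Completion call follows the legacy pre-1.0 OpenAI client that matches the GPT-3 era.

import openai  # legacy (pre-1.0) OpenAI client; adapt the call to whichever API version you use

# Hypothetical few-shot demonstrations; the paper's actual prompts and
# annotation format are not reproduced here, so these are illustrative only.
FEW_SHOT_EXAMPLES = [
    ("Aspirin is commonly used to treat headaches.",
     [("Aspirin", "treats", "headache")]),
    ("Mutations in BRCA1 are associated with breast cancer.",
     [("BRCA1", "associated_with", "breast cancer")]),
]

def build_prompt(passage: str) -> str:
    """Concatenate the few-shot demonstrations with the query passage."""
    parts = ["Extract (head, relation, tail) triples from the text."]
    for text, triples in FEW_SHOT_EXAMPLES:
        rendered = "; ".join(f"({h}, {r}, {t})" for h, r, t in triples)
        parts.append(f"Text: {text}\nTriples: {rendered}")
    parts.append(f"Text: {passage}\nTriples:")
    return "\n\n".join(parts)

def extract_triples(passage: str, model: str = "text-davinci-003") -> str:
    """Ask the model for triples; the raw completion still needs parsing and validation."""
    response = openai.Completion.create(
        model=model,
        prompt=build_prompt(passage),
        max_tokens=128,
        temperature=0.0,  # deterministic decoding for extraction
    )
    return response["choices"][0]["text"].strip()

def triple_f1(predicted: set, gold: set) -> float:
    """Micro-averaged F1 over extracted triples, matching the metrics named in the abstract."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

In practice the completion would be parsed back into triples and checked against a schema before being added to the knowledge base; one natural refinement, consistent with the abstract's emphasis on using only a few examples, is selecting the demonstrations by similarity to the query passage rather than fixing them.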
