
Towards Development of Automated Knowledge Maps and Databases for Materials Engineering using Large Language Models

Published 17 Feb 2024 in cs.DL | (2402.11323v1)

Abstract: This work presents an LLM-based workflow that uses the OpenAI ChatGPT model GPT-3.5-turbo-1106 and the Google Gemini Pro model to create summaries of text, data, and images from research articles. It is demonstrated that, through a series of processing steps, the key information can be arranged in tabular form and in knowledge graphs that capture the underlying concepts. The method offers efficiency and comprehension, enabling researchers to extract insights more effectively. Evaluation on a diverse collection of scientific papers demonstrates that the approach facilitates knowledge discovery. This work contributes to accelerated materials design through smart literature review. The method has been tested using various qualitative and quantitative measures of the gathered information. In the performance evaluation on the structured data format, the ChatGPT model achieved an F1 score of 0.40 for an exact match (ROUGE-1, ROUGE-2) and 0.479 for a relaxed match (ROUGE-L, ROUGE-Lsum). Google Gemini Pro outperforms ChatGPT, with an F1 score of 0.50 for an exact match and 0.63 for a relaxed match. The method facilitates high-throughput development of databases relevant to materials informatics. For demonstration, an example of data extraction and knowledge graph formation based on a manuscript about a titanium alloy is discussed.
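To make the reported metrics concrete, the following is a minimal sketch of how ROUGE-style F1 scores can be computed between an extracted string and a reference string. It is not the authors' evaluation code: the function names, the whitespace tokenization, and the toy sentences are assumptions made for illustration, and standard ROUGE tooling applies additional normalization (for example stemming) that is omitted here. ROUGE-N counts overlapping n-grams, while ROUGE-L scores the longest common subsequence, which tolerates inserted or missing words.

```python
from collections import Counter


def _f1(overlap, n_candidate, n_reference):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / n_candidate
    recall = overlap / n_reference
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (hypothetical helper)."""
    def ngrams(text):
        tokens = text.lower().split()  # assumption: plain whitespace tokens
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 from the longest common subsequence (hypothetical helper)."""
    a = candidate.lower().split()
    b = reference.lower().split()
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous candidate token
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return _f1(prev[-1], len(a), len(b))


# Toy strings, invented for illustration only (not data from the paper).
reference = "the alloy shows high tensile strength"
candidate = "the alloy shows very high strength"

rouge_1 = rouge_n_f1(candidate, reference, n=1)
rouge_2 = rouge_n_f1(candidate, reference, n=2)
rouge_l = rouge_l_f1(candidate, reference)
```

On these toy strings the bigram score falls well below the unigram and subsequence scores, because a single inserted or dropped word breaks every bigram it touches while leaving the subsequence largely intact.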
