
Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model (2404.03080v3)

Published 3 Apr 2024 in cs.CL and cs.AI

Abstract: Knowledge in materials science is widely dispersed across extensive scientific literature, posing significant challenges for the efficient discovery and integration of new materials. Traditional methods, often reliant on costly and time-consuming experimental approaches, further complicate rapid innovation. Addressing these challenges, the integration of artificial intelligence with materials science has opened avenues for accelerating the discovery process, though it also demands precise annotation, data extraction, and traceability of information. To tackle these issues, this article introduces the Materials Knowledge Graph (MKG), which uses advanced natural language processing techniques integrated with large language models (LLMs) to extract and systematically organize a decade's worth of high-quality research into structured triples. The resulting graph contains 162,605 nodes and 731,772 edges. MKG categorizes information into comprehensive labels such as Name, Formula, and Application, structured around a meticulously designed ontology, thus enhancing data usability and integration. By implementing network-based algorithms, MKG not only facilitates efficient link prediction but also significantly reduces reliance on traditional experimental methods. This structured approach not only streamlines materials research but also lays the groundwork for more sophisticated science knowledge graphs.
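
As a rough illustration of what "structured triples" and "link prediction" can mean in practice, the sketch below stores a handful of (head, relation, tail) triples and scores unseen material-application pairs with a simple neighborhood heuristic. Everything in it is an assumption made for illustration: the triples are invented toy data, the relation name `has_application` and the function names are hypothetical, and the Jaccard-based scoring is a generic stand-in, not the network-based algorithm used in the paper.

```python
from collections import defaultdict

# Toy triples (head, relation, tail). Illustrative only; not taken from MKG.
TRIPLES = [
    ("TiO2", "has_application", "photocatalysis"),
    ("TiO2", "has_application", "solar cell"),
    ("ZnO", "has_application", "photocatalysis"),
    ("ZnO", "has_application", "gas sensor"),
    ("SnO2", "has_application", "gas sensor"),
]


def build_adjacency(triples):
    """Map every node to the set of nodes it shares an edge with."""
    adj = defaultdict(set)
    for head, _relation, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def jaccard(adj, u, v):
    """Neighbor overlap of u and v: |N(u) & N(v)| / |N(u) | N(v)|."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def predict_links(triples):
    """Rank material-application pairs that are absent from the graph.

    A candidate pair (m, a) is scored by summing the Jaccard similarity
    between m and every material already linked to a.
    """
    adj = build_adjacency(triples)
    materials = sorted({head for head, _r, _t in triples})
    applications = sorted({tail for _h, _r, tail in triples})
    scored = []
    for m in materials:
        for a in applications:
            if a in adj[m]:
                continue  # edge already exists, nothing to predict
            score = sum(jaccard(adj, m, other) for other in adj[a])
            if score > 0:
                scored.append((score, m, a))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, material, application in predict_links(TRIPLES):
        print(f"{material} -> {application}: {score:.2f}")
```

On this toy data the highest-ranked candidate is SnO2 for photocatalysis, because SnO2 shares the gas sensor application with ZnO, which is already linked to photocatalysis. A graph at the scale described in the abstract would likely call for a graph database and a learned or more elaborate network-based scorer.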

Authors
  1. Yanpeng Ye
  2. Jie Ren
  3. Shaozhou Wang
  4. Yuwei Wan
  5. Imran Razzak
  6. Tong Xie
  7. Wenjie Zhang
  8. Haofen Wang
  9. Bram Hoex