Mining experimental data from Materials Science literature with Large Language Models: an evaluation study (2401.11052v3)

Published 19 Jan 2024 in cs.CL

Abstract: This study assesses the capabilities of LLMs such as GPT-3.5-Turbo, GPT-4, and GPT-4-Turbo in extracting structured information from scientific documents in materials science. To this end, we focus on two critical information extraction tasks: (i) named entity recognition (NER) of the studied materials and physical properties and (ii) relation extraction (RE) between these entities. Given the evident lack of datasets within Materials Informatics (MI), we conducted our evaluation using SuperMat, an annotated corpus of superconductor research, and MeasEval, a generic measurement evaluation corpus. The performance of LLMs on these tasks is benchmarked against traditional models based on the BERT architecture and rule-based approaches (baseline). We introduce a novel methodology for the comparative analysis of intricate material expressions, emphasising the standardisation of chemical formulas to tackle the complexities inherent in assessing materials science information. For NER, LLMs fail to outperform the baseline with zero-shot prompting and show only limited improvement with few-shot prompting. However, a GPT-3.5-Turbo model fine-tuned with an appropriate strategy for RE outperforms all other models, including the baseline. Without any fine-tuning, GPT-4 and GPT-4-Turbo display remarkable reasoning and relation-extraction capabilities after being given merely a couple of examples, surpassing the baseline. Overall, the results suggest that although LLMs demonstrate relevant reasoning skills in connecting concepts, specialised models remain the better choice for tasks that require extracting complex domain-specific entities such as materials. These insights provide initial guidance applicable to other materials science sub-domains in future work.
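The formula-standardisation step the abstract alludes to can be pictured with a short sketch. The snippet below is a hypothetical, minimal canonicaliser, not the paper's actual implementation: the function names (`normalise_formula`, `same_material`) and the regex rules are assumptions chosen only to illustrate the idea that spacing and element-order differences must be removed before LLM-extracted material mentions can be scored against gold annotations.

```python
import re

# Minimal sketch of the chemical-formula standardisation idea, assuming a
# simple string canonicalisation; the paper's actual rules are not
# reproduced here.

def normalise_formula(formula: str) -> str:
    """Canonicalise a material formula so that superficially different
    mentions (spacing, element order) compare equal."""
    compact = re.sub(r"\s+", "", formula)  # "MgB 2" -> "MgB2"
    # Split into (element, stoichiometry) pairs; the stoichiometry may be
    # numeric ("2", "0.15") or symbolic ("1-x").
    tokens = re.findall(r"([A-Z][a-z]?)([\d.xy+\-]*)", compact)
    # Repeated elements are not merged in this sketch; a bare element
    # symbol gets an implicit stoichiometry of "1".
    parts = {element: (amount or "1") for element, amount in tokens}
    # Alphabetical element order removes ordering differences.
    return "".join(f"{el}{amt}" for el, amt in sorted(parts.items()))

def same_material(a: str, b: str) -> bool:
    """True when two mentions normalise to the same canonical string."""
    return normalise_formula(a) == normalise_formula(b)

print(same_material("La1-xSrxCuO4", "Sr x La 1-x CuO4"))  # True
print(same_material("MgB2", "MgB 2"))                     # True
```

A matcher along these lines lets predicted and gold material mentions be compared after normalisation rather than by exact string match, which is the comparison problem the abstract's "intricate material expressions" refers to.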
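Similarly, the few-shot RE setting can be sketched as follows. This is a hedged illustration, not the study's actual prompt: the (material, critical temperature) schema, the example passages, the `extract_pairs` helper, and the model name are all assumptions, chosen only to show the shape of a two-example prompt like the one the abstract says GPT-4 succeeds with. The call uses the standard OpenAI Python SDK (assuming openai>=1.0).

```python
from openai import OpenAI  # assumes the openai>=1.0 Python SDK

# Hypothetical two-example (few-shot) prompt for material/property relation
# extraction; the study's real prompts and schema are not reproduced here.
FEW_SHOT_RE_PROMPT = """Extract (material, critical temperature) pairs from the passage.

Passage: MgB2 becomes superconducting below 39 K.
Pairs: [("MgB2", "39 K")]

Passage: The Tc of La1.85Sr0.15CuO4 is 38 K, while YBa2Cu3O7 reaches 92 K.
Pairs: [("La1.85Sr0.15CuO4", "38 K"), ("YBa2Cu3O7", "92 K")]

Passage: {passage}
Pairs:"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_pairs(passage: str, model: str = "gpt-4") -> str:
    """Send the few-shot prompt and return the raw model completion."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": FEW_SHOT_RE_PROMPT.format(passage=passage)}],
    )
    return response.choices[0].message.content

print(extract_pairs("Hg-1223 superconducts at 134 K under ambient pressure."))
```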

References (32)
  1. Small data machine learning in materials science. npj Computational Materials, 9(1):42, March 2023.
  2. Data-driven design of metal–organic frameworks for wet flue gas CO2 capture. Nature, 576:253–256, 2019.
  3. Machine learning–enabled high-entropy alloy discovery. Science, 378:78–85, 2022.
  4. An open experimental database for exploring inorganic materials. Scientific Data, 5, 2018.
  5. A polymer dataset for accelerated property prediction and design. Scientific Data, 3, 2016.
  6. Accelerating materials discovery using artificial intelligence, high performance computing and robotics. npj Computational Materials, 8(1):84, 2022.
  7. Machine learning and data mining in materials science, 2020.
  8. Advances in scientific literature mining for interpreting materials characterization. Machine Learning: Science and Technology, 2(4):045007, 2021.
  9. Big data mining and classification of intelligent material science data using machine learning. Applied Sciences, 11(18), 2021.
  10. Data augmentation in microscopic images for material data mining. npj Computational Materials, 6(1):125, 2020.
  11. Ivan A. Parinov. Microstructure and properties of high-temperature superconductors. Springer Science & Business Media, 2013.
  12. Exploration of new superconductors and functional materials, and fabrication of superconducting tapes and wires of iron pnictides. Science and Technology of Advanced Materials, 2015.
  13. Electron doping of the iron-arsenide superconductor CeFeAsO controlled by hydrostatic pressure. Physical Review Letters, 125(20):207001, 2020.
  14. Theory of superconductivity. Physical Review, 108(5):1175, 1957.
  15. One small step for generative AI, one giant leap for AGI: A complete survey on ChatGPT in AIGC era. arXiv preprint arXiv:2304.06488, 2023.
  16. Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601, 2023.
  17. On the planning abilities of large language models–a critical investigation. arXiv preprint arXiv:2305.15771, 2023.
  18. Pearl: Prompting large language models to plan and execute actions over long documents. arXiv preprint arXiv:2305.14564, 2023.
  19. OpenAI. Models, 2024.
  20. ChatGPT: Jack of all trades, master of none. Information Fusion, 99:101861, November 2023.
  21. Large language model is not a good few-shot information extractor, but a good reranker for hard samples! 2023.
  22. Yes but.. can ChatGPT identify entities in historical documents? arXiv preprint arXiv:2303.17322, 2023.
  23. GPT-3 models are poor few-shot learners in the biomedical domain, 2022.
  24. Prompt engineering of GPT-4 for chemical research: what can/cannot be done? Science and Technology of Advanced Materials: Methods, 3(1):2260300, 2023.
  25. Using GPT-4 in parameter selection of polymer informatics: improving predictive accuracy amidst data scarcity and ‘ugly duckling’ dilemma. Digital Discovery, 2(5):1548–1557, 2023.
  26. Automatic extraction of materials and properties from superconductors scientific literature. Science and Technology of Advanced Materials: Methods, 3, 2023.
  27. Automatic identification and normalisation of physical measurements in scientific literature. In Proceedings of the ACM Symposium on Document Engineering 2019, DocEng ’19, New York, NY, USA, 2019. Association for Computing Machinery.
  28. N. Reimers and I. Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. 2019.
  29. SemEval-2021 task 8: MeasEval – extracting counts and measurements and their related contexts. In Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, and Xiaodan Zhu, editors, Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 306–316, Online, August 2021. Association for Computational Linguistics.
  30. SuperMat: Construction of a linked annotated dataset from superconductors-related publications. Science and Technology of Advanced Materials: Methods, 1:34–44, 2021.
  31. SciBERT: A pretrained language model for scientific text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3615–3620, Hong Kong, China, November 2019. Association for Computational Linguistics.
  32. Galactica: A large language model for science. arXiv preprint arXiv:2211.09085, 2022.
Authors (4)
  1. Luca Foppiano
  2. Guillaume Lambard
  3. Toshiyuki Amagasa
  4. Masashi Ishii