
AlloyBERT: Alloy Property Prediction with Large Language Models (2403.19783v1)

Published 28 Mar 2024 in cond-mat.mtrl-sci and cs.LG

Abstract: The pursuit of novel alloys tailored to specific requirements poses significant challenges for researchers in the field. This underscores the importance of developing predictive techniques for essential physical properties of alloys based on their chemical composition and processing parameters. This study introduces AlloyBERT, a transformer encoder-based model designed to predict properties such as elastic modulus and yield strength of alloys using textual inputs. Leveraging the pre-trained RoBERTa encoder model as its foundation, AlloyBERT employs self-attention mechanisms to establish meaningful relationships between words, enabling it to interpret human-readable input and predict target alloy properties. By combining a tokenizer trained on our textual data with a RoBERTa encoder pre-trained and fine-tuned for this specific task, we achieved a mean squared error (MSE) of 0.00015 on the Multi Principal Elemental Alloys (MPEA) dataset and 0.00611 on the Refractory Alloy Yield Strength (RAYS) dataset. This surpasses the performance of shallow models, which achieved best-case MSEs of 0.00025 and 0.0076 on the MPEA and RAYS datasets, respectively. Our results highlight the potential of LLMs in materials science and establish a foundational framework for text-based prediction of alloy properties that does not rely on complex underlying representations, calculations, or simulations.
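
The abstract describes the core recipe: a human-readable description of an alloy's composition and processing route is tokenized, passed through a pre-trained RoBERTa encoder, and regressed onto a scalar property such as yield strength. The sketch below illustrates that setup with Hugging Face transformers. It is not the authors' released code: the model name, example strings, target values, and hyperparameters are illustrative assumptions, and where the paper trains its own byte-pair-encoding tokenizer on the alloy corpus, this sketch reuses the stock roberta-base tokenizer for brevity.

```python
# Minimal sketch of the AlloyBERT idea (assumed details, not the paper's code):
# fine-tune a pre-trained RoBERTa encoder with a linear head to regress a
# scalar alloy property from a textual description of composition/processing.
import torch
from torch import nn
from transformers import RobertaModel, RobertaTokenizerFast

class AlloyRegressor(nn.Module):
    def __init__(self, encoder_name: str = "roberta-base"):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(encoder_name)
        # Single-output regression head on the first-token representation.
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]      # <s>-token embedding
        return self.head(cls).squeeze(-1)      # (batch,) predicted property

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = AlloyRegressor()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
loss_fn = nn.MSELoss()

# Hypothetical textual inputs and normalized targets (made-up values).
texts = [
    "Alloy of Mo, Nb, Ta, W in equal atomic fractions, cast and annealed.",
    "Refractory alloy with 40% Ti, 30% Zr, 30% Hf, hot-rolled at 1200 C.",
]
targets = torch.tensor([0.42, 0.57])  # scaled yield strengths

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
model.train()
optimizer.zero_grad()
pred = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(pred, targets)
loss.backward()
optimizer.step()
print(f"MSE on toy batch: {loss.item():.5f}")
```

In practice one would normalize targets to a common scale (the small MSE values reported in the abstract suggest scaled targets), train for multiple epochs over the full MPEA or RAYS dataset, and evaluate on a held-out split; AdamW with decoupled weight decay is a standard choice for fine-tuning encoders of this kind.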

Citations (1)
