LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation (2401.17244v3)
Abstract: Reducing hallucination in LLMs is imperative for use in the sciences, where reliability and reproducibility are crucial. However, LLMs inherently lack long-term memory, making it a nontrivial, ad hoc, and inevitably biased task to fine-tune them on domain-specific literature and data. Here we introduce LLaMP, a multimodal retrieval-augmented generation (RAG) framework of hierarchical reasoning-and-acting (ReAct) agents that can dynamically and recursively interact with computational and experimental data on the Materials Project (MP) and run atomistic simulations via a high-throughput workflow interface. Without fine-tuning, LLaMP demonstrates strong tool-usage ability to comprehend and integrate various modalities of materials science concepts, fetch relevant data stores on the fly, process higher-order data (such as crystal structures and elastic tensors), and streamline complex tasks in computational materials science and chemistry. We propose a simple metric combining uncertainty and confidence estimates to evaluate the self-consistency of responses by LLaMP and vanilla LLMs. Our benchmark shows that LLaMP effectively mitigates the intrinsic bias in LLMs, counteracting the errors on bulk moduli, electronic bandgaps, and formation energies that seem to derive from mixed data sources. We also demonstrate LLaMP's capability to edit crystal structures and run annealing molecular dynamics simulations using pre-trained machine-learning force fields. The framework offers an intuitive and nearly hallucination-free approach to exploring and scaling materials informatics, and establishes a pathway for knowledge distillation and fine-tuning of other LLMs. Code and a live demo are available at https://github.com/chiang-yuan/llamp
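The hierarchical ReAct pattern named in the abstract can be illustrated with a toy sketch. This is not LLaMP's implementation: the supervisor, the specialist agents, the scripted policy that stands in for the LLM, and the placeholder data store are all hypothetical names introduced here, and no real Materials Project values or API calls are used. The sketch only shows the control flow of a supervisor that alternates thought, action, and observation while delegating retrieval to per-property specialists.

```python
from typing import Callable, Dict, List, Tuple

# Placeholder records standing in for a remote data store.
# These are NOT real Materials Project values.
PLACEHOLDER_STORE: Dict[Tuple[str, str], str] = {
    ("Si", "band_gap"): "<band gap record for Si>",
    ("Si", "bulk_modulus"): "<elasticity record for Si>",
}


def make_specialist(prop: str) -> Callable[[str], str]:
    """Build a hypothetical specialist agent that retrieves one property."""
    def specialist(formula: str) -> str:
        return PLACEHOLDER_STORE.get((formula, prop), "no record found")
    return specialist


# Lower level of the hierarchy: one specialist per property (assumed layout).
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "band_gap": make_specialist("band_gap"),
    "bulk_modulus": make_specialist("bulk_modulus"),
}


def scripted_policy(question: str, observations: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, action_input).

    Questions use the toy form "formula:property".
    """
    formula, prop = question.split(":")
    if not observations:
        return (f"I need retrieved data for {prop}.", prop, formula)
    return ("I can answer from the observation.", "finish", observations[-1])


def supervisor(question: str, max_steps: int = 4) -> Tuple[str, List[str]]:
    """Top-level ReAct loop: think, act through a specialist, observe, repeat."""
    observations: List[str] = []
    trace: List[str] = []
    for _ in range(max_steps):
        thought, action, arg = scripted_policy(question, observations)
        trace.append(f"Thought: {thought}")
        if action == "finish":
            return arg, trace
        trace.append(f"Action: {action}[{arg}]")
        tool = SPECIALISTS.get(action)
        observation = tool(arg) if tool else f"unknown tool: {action}"
        trace.append(f"Observation: {observation}")
        observations.append(observation)
    return "no answer within step budget", trace


if __name__ == "__main__":
    answer, trace = supervisor("Si:band_gap")
    print("\n".join(trace))
    print("Answer:", answer)
```

In this sketch the answer is always copied from a retrieved observation and never produced from the policy's own memory, which is the grounding idea behind retrieval-augmented generation. A real system would replace `scripted_policy` with an LLM call and the placeholder store with live data retrieval.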
Authors: Yuan Chiang, Chia-Hong Chou, Janosh Riebesell, Elvis Hsieh