Large Language Models are In-Context Molecule Learners (2403.04197v2)

Published 7 Mar 2024 in cs.CL and cs.AI

Abstract: LLMs have demonstrated exceptional performance in biochemical tasks, especially the molecule-caption translation task, which aims to bridge the gap between molecules and natural language texts. However, previous methods for adapting LLMs to the molecule-caption translation task required extra domain-specific pre-training stages, suffered from weak alignment between molecular and textual spaces, or imposed stringent demands on the scale of LLMs. To resolve these challenges, we propose In-Context Molecule Adaptation (ICMA), a new paradigm that allows LLMs to learn molecule-text alignment from context examples via In-Context Molecule Tuning. Specifically, ICMA incorporates three stages: Hybrid Context Retrieval, Post-retrieval Re-ranking, and In-Context Molecule Tuning. First, Hybrid Context Retrieval uses BM25 Caption Retrieval and Molecule Graph Retrieval to retrieve informative context examples. We further propose Post-retrieval Re-ranking with Sequence Reversal and Random Walk to improve the quality of the retrieval results. Finally, In-Context Molecule Tuning unlocks the in-context molecule learning capability of LLMs with retrieved examples and adapts the parameters of LLMs for the molecule-caption translation task. Experimental results demonstrate that ICMA can empower LLMs to achieve state-of-the-art or comparable performance without extra training corpora or intricate structures, showing that LLMs are inherently in-context molecule learners.
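
The sketch below illustrates only the first and last ideas of this pipeline: BM25-based caption retrieval of context examples and assembly of an in-context prompt of the kind the model would then be fine-tuned on (In-Context Molecule Tuning). It is a minimal sketch, assuming the rank_bm25 package and a toy (SMILES, caption) corpus; Molecule Graph Retrieval, Post-retrieval Re-ranking, and the actual parameter tuning are omitted, and names such as build_icl_prompt are invented for illustration rather than taken from the authors' code.

```python
from rank_bm25 import BM25Okapi  # assumed dependency: pip install rank-bm25

# Toy (SMILES, caption) pairs standing in for the training split.
corpus = [
    ("CCO", "Ethanol is a primary alcohol widely used as a solvent."),
    ("CC(=O)O", "Acetic acid is a simple carboxylic acid."),
    ("c1ccccc1", "Benzene is an aromatic hydrocarbon with a six-membered ring."),
]

# BM25 Caption Retrieval: index the captions, score them against a query caption.
bm25 = BM25Okapi([caption.lower().split() for _, caption in corpus])

def retrieve_examples(query_caption: str, k: int = 2):
    """Return the k (SMILES, caption) pairs whose captions best match the query."""
    scores = bm25.get_scores(query_caption.lower().split())
    ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
    return [corpus[i] for i in ranked[:k]]

def build_icl_prompt(query_caption: str, k: int = 2) -> str:
    """Assemble retrieved pairs as in-context demonstrations for caption-to-molecule generation."""
    parts = []
    for smiles, caption in retrieve_examples(query_caption, k):
        parts.append(f"Caption: {caption}\nMolecule: {smiles}\n")
    parts.append(f"Caption: {query_caption}\nMolecule:")
    return "\n".join(parts)

# During In-Context Molecule Tuning, prompts shaped like this (with the gold
# molecule appended as the target) would be used to fine-tune the LLM.
print(build_icl_prompt("An aromatic compound containing a benzene ring."))
```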

Authors (6)
  1. Jiatong Li (47 papers)
  2. Wei Liu (1135 papers)
  3. Zhihao Ding (9 papers)
  4. Wenqi Fan (78 papers)
  5. Yuqiang Li (45 papers)
  6. Qing Li (430 papers)
Citations (2)
