Bridging Text and Molecule: A Survey on Multimodal Frameworks for Molecule (2403.13830v1)

Published 7 Mar 2024 in q-bio.BM, cs.CL, and cs.LG

Abstract: Artificial intelligence has demonstrated immense potential in scientific research. Within molecular science, it is revolutionizing the traditional computer-aided paradigm, ushering in a new era of deep learning. With recent progress in multimodal learning and natural language processing, an emerging trend has targeted building multimodal frameworks that jointly model molecules with textual domain knowledge. In this paper, we present the first systematic survey on multimodal frameworks for molecule research. Specifically, we begin with the development of molecular deep learning and point out the necessity of involving the textual modality. Next, we focus on recent advances in text-molecule alignment methods, categorizing current models into two groups based on their architectures and listing the relevant pre-training tasks. Furthermore, we delve into the utilization of LLMs and prompting techniques for molecular tasks and present significant applications in drug discovery. Finally, we discuss the limitations of this field and highlight several promising directions for future research.
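
The text-molecule alignment methods the survey covers typically rely on a contrastive pre-training objective that pulls paired molecule and description embeddings together while pushing mismatched pairs apart. As a minimal sketch (not taken from the paper itself), the snippet below shows a symmetric InfoNCE-style alignment loss of the kind used in CLIP-like text-molecule models; the tensor names `mol_emb` and `text_emb` and the function name are hypothetical placeholders for the outputs of a molecule encoder and a text encoder.

```python
import torch
import torch.nn.functional as F

def text_molecule_alignment_loss(mol_emb: torch.Tensor,
                                 text_emb: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired
    (molecule, description) embeddings, shape (B, D) each.
    A sketch of the generic contrastive alignment objective,
    not any specific surveyed model's implementation."""
    mol_emb = F.normalize(mol_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # (B, B) cosine-similarity matrix; diagonal entries are true pairs.
    logits = mol_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Each molecule should match its own description, and vice versa.
    loss_m2t = F.cross_entropy(logits, targets)
    loss_t2m = F.cross_entropy(logits.t(), targets)
    return (loss_m2t + loss_t2m) / 2
```

The same similarity matrix supports retrieval-style evaluation: the row-wise argmax gives molecule-to-text matches and the column-wise argmax gives text-to-molecule matches.
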

Authors (4)
  1. Yi Xiao (49 papers)
  2. Xiangxin Zhou (22 papers)
  3. Qiang Liu (405 papers)
  4. Liang Wang (512 papers)
Citations (3)