
Preference Optimization for Molecular Language Models (2310.12304v1)

Published 18 Oct 2023 in stat.ML, cs.AI, and cs.LG

Abstract: Molecular language modeling is an effective approach to generating novel chemical structures. However, these models do not a priori encode certain preferences a chemist may desire. We investigate the use of fine-tuning using Direct Preference Optimization to better align generated molecules with chemist preferences. Our findings suggest that this approach is simple, efficient, and highly effective.
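
For readers unfamiliar with Direct Preference Optimization, the following is a minimal sketch of the standard per-pair DPO loss, not the paper's own implementation. It assumes the summed log-probabilities of a preferred and a dispreferred molecule string have already been computed under both the fine-tuned policy and a frozen reference model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions and do not come from the paper.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_logratio - rejected_logratio))).

    All inputs are sequence log-probabilities (floats). The names and the
    default beta are hypothetical, chosen for illustration only.
    """
    # How much more (or less) likely each sequence is under the policy
    # than under the frozen reference model.
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected

    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: the margin is zero.
    print(dpo_loss(-5.0, -5.0, -5.0, -5.0))
    # Policy has shifted probability toward the preferred molecule.
    print(dpo_loss(-4.0, -6.0, -5.0, -5.0))
```

The loss falls as the policy raises the likelihood of the preferred molecule relative to the reference and lowers that of the dispreferred one. In practice the loss would be averaged over a batch of preference pairs and minimized with a gradient-based optimizer.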

