Multi-Objective Latent Space Optimization of Generative Molecular Design Models (2203.00526v3)
Abstract: Molecular design based on generative models, such as variational autoencoders (VAEs), has become increasingly popular in recent years because such models can efficiently explore high-dimensional molecular space to identify molecules with desired properties. While the efficacy of the initial model strongly depends on the training data, the model's sampling efficiency for suggesting novel molecules with enhanced properties can be further improved via latent space optimization. In this paper, we propose a multi-objective latent space optimization (LSO) method that can significantly enhance the performance of generative molecular design (GMD). The proposed method adopts an iterative weighted retraining approach, in which the weight of each molecule in the training data is determined by its Pareto efficiency. We demonstrate that our multi-objective GMD LSO method can significantly improve the performance of GMD for jointly optimizing multiple molecular properties.
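To make the weighting idea concrete, the following is a minimal sketch of how Pareto efficiency could be turned into per-molecule retraining weights. It is not the authors' implementation. The helper names (`dominates`, `pareto_ranks`, `retraining_weights`), the convention that every objective is maximized, and the rank-based weight form `1 / (k * N + rank)` are all assumptions made for illustration; the abstract only states that the weights are determined by Pareto efficiency.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (assumption: all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(scores: Sequence[Sequence[float]]) -> List[int]:
    """Non-dominated sorting: rank 0 is the Pareto front, rank 1 is the
    front that remains after removing rank 0, and so on."""
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    rank = 0
    while remaining:
        # Points that no other remaining point dominates form the current front.
        front = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


def retraining_weights(ranks: Sequence[int], k: float = 1e-3) -> List[float]:
    """Hypothetical rank-based weights proportional to 1 / (k * N + rank),
    normalized to sum to 1. Smaller k concentrates weight on the best fronts."""
    n = len(ranks)
    raw = [1.0 / (k * n + r) for r in ranks]
    total = sum(raw)
    return [w / total for w in raw]


# Toy data: each row holds two property values for one molecule (made-up numbers).
scores = [(1, 1), (2, 2), (3, 1), (1, 3), (0, 0)]
ranks = pareto_ranks(scores)
weights = retraining_weights(ranks, k=0.1)
print(ranks)  # [1, 0, 0, 0, 2]
```

In an iterative scheme of this kind, each round would recompute the ranks on the current training set (including newly generated molecules), derive the weights, and retrain the generative model with them, for example by sampling training molecules with `random.choices(molecules, weights=weights, k=batch_size)`, where `molecules` and `batch_size` are placeholders.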
- A N M Nafiz Abeer
- Nathan Urban
- M Ryan Weil
- Francis J. Alexander
- Byung-Jun Yoon