
Towards DNA-Encoded Library Generation with GFlowNets (2404.10094v1)

Published 15 Apr 2024 in cs.LG and q-bio.QM

Abstract: DNA-encoded libraries (DELs) are a powerful approach for rapidly screening large numbers of diverse compounds. One of the key challenges in using DELs is library design, which involves choosing the building blocks that will be combinatorially combined to produce the final library. In this paper we consider the task of protein-protein interaction (PPI) biased DEL design. To this end, we evaluate several machine learning algorithms on the PPI modulation task and use them as a reward for the proposed GFlowNet-based generative approach. We additionally investigate the possibility of using structural information about building blocks to design a hierarchical action space for the GFlowNet. The observed results indicate that GFlowNets are a promising approach for generating diverse combinatorial library candidates.
