
Compositional Deep Probabilistic Models of DNA Encoded Libraries (2310.13769v2)

Published 20 Oct 2023 in q-bio.QM and stat.ML

Abstract: DNA-Encoded Library (DEL) technology has proven to be a powerful tool that utilizes combinatorially constructed small molecules to facilitate highly efficient screening assays. These selection experiments, involving multiple stages of washing, elution, and identification of potent binders via unique DNA barcodes, often generate complex data. This complexity can potentially mask the underlying signals, necessitating the application of computational tools such as machine learning to uncover valuable insights. We introduce a compositional deep probabilistic model of DEL data, DEL-Compose, which decomposes molecular representations into their mono-synthon, di-synthon, and tri-synthon building blocks and capitalizes on the inherent hierarchical structure of these molecules by modeling latent reactions between embedded synthons. Additionally, we investigate methods to improve the observation models for DEL count data, such as integrating covariate factors to more effectively account for data noise. Across two popular public benchmark datasets (CA-IX and HRP), our model demonstrates strong performance compared to count baselines, enriches the correct pharmacophores, and offers valuable insights via its intrinsic interpretable structure, thereby providing a robust tool for the analysis of DEL data.
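
As a rough illustration of the decomposition the abstract describes, the sketch below enumerates the mono-, di-, and tri-synthon groupings of a single three-synthon molecule. It covers only the combinatorial bookkeeping, not the DEL-Compose model itself; the function name `synthon_subsets` and the synthon labels are hypothetical and do not come from the paper.

```python
from itertools import combinations

def synthon_subsets(synthons):
    """Group all non-empty subsets of a molecule's synthons by subset size.

    Hypothetical helper for illustration; not taken from the paper.
    """
    return {
        k: list(combinations(synthons, k))
        for k in range(1, len(synthons) + 1)
    }

# Made-up labels for the building blocks of one three-cycle DEL molecule.
molecule = ("A3", "B17", "C42")
groups = synthon_subsets(molecule)

for size, subsets in groups.items():
    print(size, subsets)
```

A three-synthon molecule yields three mono-synthons, three di-synthons, and one tri-synthon, which are the levels of the hierarchy the abstract refers to.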

