
Target-Free Compound Activity Prediction via Few-Shot Learning (2311.16328v1)

Published 27 Nov 2023 in cs.LG and q-bio.QM

Abstract: Predicting the activities of compounds against protein-based or phenotypic assays using only a few known compounds and their activities is a common task in target-free drug discovery. Existing few-shot learning approaches are limited to predicting binary labels (active/inactive). However, in real-world drug discovery, degrees of compound activity are highly relevant. We study Few-Shot Compound Activity Prediction (FS-CAP) and design a novel neural architecture to meta-learn continuous compound activities across large bioactivity datasets. Our model aggregates encodings generated from the known compounds and their activities to capture assay information. We also introduce a separate encoder for the unknown compound. We show that FS-CAP surpasses traditional similarity-based techniques as well as other state-of-the-art few-shot learning methods in a variety of target-free drug discovery settings and datasets.
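
The data flow described in the abstract (encode each known compound together with its activity, aggregate those encodings into an assay representation, encode the unknown compound separately, then predict a continuous activity) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `FewShotActivitySketch`, the layer sizes, the use of a mean as the aggregation, the concatenation of the activity value onto the compound features, and the toy fingerprint vectors are all assumptions, and the weights are random rather than meta-learned.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Random weights (n_out x n_in) and zero bias; a stand-in for a trained layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply an affine map to the vector x."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


class FewShotActivitySketch:
    """Hypothetical sketch: context encoder, aggregation, query encoder, predictor."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        # Assumption: the context encoder sees compound features plus one activity value.
        self.context_encoder = make_linear(n_features + 1, n_hidden, rng)
        # Separate encoder for the unknown (query) compound.
        self.query_encoder = make_linear(n_features, n_hidden, rng)
        self.predictor = make_linear(2 * n_hidden, 1, rng)

    def encode_assay(self, context):
        """Encode each (features, activity) pair, then average the encodings."""
        encodings = [
            relu(linear(self.context_encoder, list(features) + [activity]))
            for features, activity in context
        ]
        # Assumption: mean aggregation, which is invariant to context order.
        return [sum(column) / len(encodings) for column in zip(*encodings)]

    def predict(self, context, query_features):
        """Return a continuous activity estimate for the query compound."""
        assay = self.encode_assay(context)
        query = relu(linear(self.query_encoder, query_features))
        return linear(self.predictor, assay + query)[0]


# Toy context set: three known compounds (made-up 4-bit fingerprints) with made-up activities.
context = [
    ([1.0, 0.0, 1.0, 0.0], 6.5),
    ([0.0, 1.0, 1.0, 0.0], 7.2),
    ([1.0, 1.0, 0.0, 1.0], 5.1),
]
model = FewShotActivitySketch(n_features=4)
prediction = model.predict(context, [1.0, 0.0, 0.0, 1.0])
```

Because the weights are untrained, the value of `prediction` carries no meaning; the point is only the structure. Averaging is one plausible aggregation because it makes the assay representation independent of the order and number of known compounds, though the abstract does not say which aggregation the paper uses.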

Authors (4)
  1. Peter Eckmann
  2. Jake Anderson
  3. Rose Yu
  4. Michael K. Gilson