
Barlow Twins Deep Neural Network for Advanced 1D Drug-Target Interaction Prediction (2408.00040v3)

Published 31 Jul 2024 in q-bio.BM, cs.AI, and cs.LG

Abstract: Accurate prediction of drug-target interactions is critical for advancing drug discovery. By reducing time and cost, machine learning and deep learning can accelerate this laborious discovery process. Our novel approach, BarlowDTI, utilises the Barlow Twins architecture for feature extraction while considering the structure of the target protein. The method achieves state-of-the-art predictive performance on multiple established benchmarks using only one-dimensional input. Using a gradient boosting machine as the underlying predictor ensures fast and efficient predictions without the need for substantial computational resources. We also investigate how the model reaches its decision based on individual training samples. By comparing co-crystal structures, we find that BarlowDTI effectively exploits catalytically active and stabilising residues, highlighting the model's ability to generalise from one-dimensional input data. In addition, we benchmark new baselines against existing methods. Together, these innovations improve the efficiency and effectiveness of drug-target interaction predictions, providing robust tools for accelerating drug development and deepening the understanding of molecular interactions. Finally, we provide an easy-to-use web interface that can be freely accessed at https://www.bio.nat.tum.de/oc2/barlowdti .
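
To make the feature-extraction idea concrete, the following is a minimal sketch of the generic Barlow Twins redundancy-reduction objective, written in plain Python. It is not the BarlowDTI implementation: the function names, the toy embeddings `z_a` and `z_b`, and the weight `lam` are assumptions chosen for illustration, and the abstract does not say how the two inputs are encoded. The objective pushes the cross-correlation matrix of two batches of embeddings towards the identity, so matching dimensions agree and different dimensions carry non-redundant information.

```python
import math


def standardize(batch):
    """Centre each embedding dimension to mean 0 and scale it to unit population std."""
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        for i in range(n):
            # Small epsilon guards against division by zero for constant columns.
            out[i][j] = (col[i] - mean) / (std + 1e-12)
    return out


def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins loss for two batches of embeddings (lists of equal-length rows).

    C[i][j] is the batch-averaged product of standardized dimension i of z_a
    and dimension j of z_b. The loss is
        sum_i (1 - C[i][i])**2 + lam * sum_{i != j} C[i][j]**2.
    `lam` is an illustrative default, not a value taken from BarlowDTI.
    """
    n, d = len(z_a), len(z_a[0])
    a, b = standardize(z_a), standardize(z_b)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2   # invariance term (diagonal)
            else:
                loss += lam * c_ij ** 2     # redundancy-reduction term (off-diagonal)
    return loss


# Toy batch: 4 samples, 2 uncorrelated embedding dimensions (hypothetical data).
z = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
identical_views = barlow_twins_loss(z, z)
flipped_views = barlow_twins_loss(z, [[-x for x in row] for row in z])
```

With identical, decorrelated views the loss is essentially zero, while sign-flipped views are penalised on the diagonal. In a pipeline of the kind the abstract describes, embeddings learned under such an objective would then be passed to a gradient boosting model as input features; that step is not shown here.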

