Why Deep Models Often Cannot Beat Non-deep Counterparts on Molecular Property Prediction? (2306.17702v1)
Abstract: Molecular property prediction (MPP) is a crucial task in the drug discovery pipeline that has recently gained considerable attention thanks to advances in deep neural networks. However, recent research has revealed that deep models struggle to beat traditional non-deep ones on MPP. In this study, we benchmark 12 representative models (3 non-deep models and 9 deep models) on 14 molecule datasets. Through the most comprehensive study to date, we make the following key observations: (i) deep models are generally unable to outperform non-deep ones; (ii) the failure of deep models on MPP cannot be solely attributed to the small size of molecular datasets; what matters is the irregular pattern of molecule data; (iii) in particular, tree models that take molecular fingerprints as inputs tend to outperform the other competitors. Furthermore, we conduct extensive empirical investigations into the unique patterns of molecule data and the inductive biases of various models that underlie these phenomena.
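The winning recipe the abstract highlights — a tree ensemble trained on binary molecular fingerprints — can be sketched as follows. This is an illustrative toy, not the paper's benchmark code: the random bit matrix stands in for real ECFP fingerprints (which would be computed with a cheminformatics toolkit such as RDKit), the labels are synthetic, and a scikit-learn random forest stands in for the tree models compared in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for fingerprint vectors: rows = molecules, columns = substructure
# presence bits (real ECFPs are typically 1024- or 2048-dimensional).
X = rng.integers(0, 2, size=(500, 128))

# Synthetic "property" label driven by a handful of bits, mimicking the
# sparse substructure-activity patterns that tree models exploit well.
y = (X[:, :8].sum(axis=1) > 4).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A tree ensemble over fingerprint bits: the non-deep baseline family that
# the benchmark finds hard for deep models to beat.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

On real data, `X` would be replaced by fingerprints extracted from SMILES strings, and evaluation would use scaffold splits rather than a random split.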
- Jun Xia
- Lecheng Zhang
- Xiao Zhu
- Stan Z. Li