Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+ (2303.16982v2)
Abstract: Recent developments in deep learning have made remarkable progress in speeding up the prediction of quantum chemical (QC) properties by removing the need for expensive electronic structure calculations such as density functional theory (DFT). However, previous methods that learn from 1D SMILES sequences or 2D molecular graphs fail to achieve high accuracy, because QC properties depend primarily on the 3D equilibrium conformations optimized by electronic structure methods, which differ substantially from sequence- and graph-type data. In this paper, we propose a novel approach called Uni-Mol+ to tackle this challenge. Uni-Mol+ first generates a raw 3D molecular conformation using inexpensive methods such as RDKit. The raw conformation is then iteratively updated toward its target DFT equilibrium conformation using neural networks, and the learned conformation is used to predict the QC properties. To effectively learn this update process toward the equilibrium conformation, we introduce a two-track Transformer model backbone and train it with the QC property prediction task. We also design a novel approach to guide the model's training process. Our extensive benchmarking results demonstrate that the proposed Uni-Mol+ significantly improves the accuracy of QC property prediction across various datasets. We have made the code and model publicly available at \url{https://github.com/dptech-corp/Uni-Mol}.
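The pipeline described above (cheap initial 3D conformation, then iterative refinement toward the DFT equilibrium geometry) can be sketched in miniature. This is a minimal illustration only: `predict_update` stands in for the paper's two-track Transformer and simply moves atoms a fixed fraction toward a known target, which is of course unavailable at inference time; all names here are hypothetical and not part of the Uni-Mol+ codebase.

```python
# Hypothetical sketch of the iterative conformation-update loop in Uni-Mol+.
# `predict_update` is a stand-in for the learned two-track Transformer.

def predict_update(coords, target, frac=0.5):
    # Mimic a learned per-atom displacement by moving each coordinate
    # a fraction of the way toward its equilibrium position.
    return [[(t - c) * frac for c, t in zip(atom, tgt)]
            for atom, tgt in zip(coords, target)]

def refine(coords, target, n_iters=8):
    # Iteratively apply the predicted displacement, as Uni-Mol+ refines
    # a raw (e.g. RDKit-generated) conformation toward the DFT geometry.
    for _ in range(n_iters):
        delta = predict_update(coords, target)
        coords = [[c + d for c, d in zip(atom, d_atom)]
                  for atom, d_atom in zip(coords, delta)]
    return coords

raw = [[0.0, 0.0, 0.0], [1.6, 0.0, 0.0]]  # cheap initial diatomic geometry
eq  = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0]]  # target equilibrium geometry
refined = refine(raw, eq)                 # bond length converges toward 1.1
```

In the actual method, the refined coordinates (not the raw ones) are fed to the property-prediction head, which is why learning the update process well matters for accuracy.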
Authors: Shuqi Lu, Zhifeng Gao, Di He, Linfeng Zhang, Guolin Ke