Fast Quantum Property Prediction via Deeper 2D and 3D Graph Networks (2106.08551v1)

Published 16 Jun 2021 in cs.LG

Abstract: Molecular property prediction is gaining increasing attention due to its diverse applications. One task of particular interest and importance is to predict quantum chemical properties without 3D equilibrium structures. This is practically favorable since obtaining 3D equilibrium structures requires extremely expensive calculations. In this work, we design a deep graph neural network to predict quantum properties by directly learning from 2D molecular graphs. In addition, we propose a 3D graph neural network to learn from low-cost conformer sets, which can be obtained with open-source tools using an affordable budget. We employ our methods to participate in the 2021 KDD Cup on OGB Large-Scale Challenge (OGB-LSC), which aims to predict the HOMO-LUMO energy gap of molecules. Final evaluation results reveal that we are one of the winners with a mean absolute error of 0.1235 on the holdout test set. Our implementation is available as part of the MoleculeX package (https://github.com/divelab/MoleculeX).

Summary

The paper introduces a deep 2D graph neural network and a 3D graph neural network that predict quantum chemical properties without requiring expensive 3D equilibrium structures.
The 2D model builds on deep architectures such as DeeperGCN and DAGNN to increase expressivity; the overall method reaches a mean absolute error of 0.1235 for HOMO-LUMO gap prediction on the holdout test set.
Because it avoids equilibrium-geometry calculations and relies only on 2D graphs and low-cost conformer sets, the approach lowers the cost of property prediction, which is relevant to drug discovery and materials engineering.

The paper "Fast Quantum Property Prediction via Deeper 2D and 3D Graph Networks" addresses a practical challenge in computational chemistry: predicting quantum chemical properties without first computing 3D equilibrium structures, which is extremely expensive. It uses a deep graph neural network (GNN) to predict such properties directly from 2D molecular graphs, and a second, 3D GNN that learns from low-cost conformer sets, so that the costly geometry calculation is avoided at prediction time.

Summary of Methods

The proposed solution consists of two graph network architectures: one for 2D molecular graphs and another for low-cost 3D conformer sets. The 2D model builds on deep graph network designs such as DeeperGCN and DAGNN, which increase expressivity and enlarge the receptive field when processing 2D molecular graphs, where nodes are atoms and edges are bonds. Depth matters because a message-passing network with k layers can only combine information from atoms within k bonds of each other; a shallow network therefore misses longer-range structure that a deeper one can capture.
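The link between depth and receptive field can be made concrete with a small sketch. The code below is not the authors' model or the DeeperGCN/DAGNN implementation; it only counts which atoms can influence a given atom after a number of neighbor-aggregation rounds. The function name `k_hop_receptive_field` and the toy eight-carbon chain are assumptions introduced for illustration.

```python
from collections import deque


def k_hop_receptive_field(adjacency, start, num_layers):
    """Return the atoms whose features can reach `start` after `num_layers`
    rounds of neighbor aggregation (breadth-first search cut off at that depth)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        atom, depth = frontier.popleft()
        if depth == num_layers:
            continue  # atoms beyond this depth are invisible to `start`
        for neighbor in adjacency[atom]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


# Hypothetical toy molecule: a chain of eight carbon atoms indexed 0..7,
# with a bond between each pair of consecutive atoms.
NUM_ATOMS = 8
chain = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < NUM_ATOMS]
    for i in range(NUM_ATOMS)
}

for num_layers in (1, 3, 7):
    field = k_hop_receptive_field(chain, start=0, num_layers=num_layers)
    print(num_layers, sorted(field))
```

In this toy chain, the atom at one end needs seven aggregation rounds before the atom at the other end can affect its representation, which is the kind of limitation that motivates deeper networks.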

In parallel, the 3D model learns from low-cost conformer sets generated with open-source tools. Because quantum chemical properties depend strongly on 3D structure, this supplies geometric information that a purely 2D approach lacks. The conformers are only approximations of the equilibrium structure, but the 3D information they carry is still informative enough to improve predictions.
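As a rough illustration of how a set of approximate conformers can be turned into model inputs, the sketch below averages interatomic distance matrices over a conformer set. Distances are a common choice in 3D GNNs because they do not change when a molecule is rotated or translated; whether the authors' network uses exactly this representation is not stated here, so treat it as an assumption. The coordinates and the function names `pairwise_distances` and `mean_distance_matrix` are hypothetical.

```python
import math


def pairwise_distances(coords):
    """Distance matrix for one conformer, given a list of (x, y, z) tuples."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def mean_distance_matrix(conformers):
    """Average the distance matrices of all conformers of one molecule."""
    matrices = [pairwise_distances(coords) for coords in conformers]
    n = len(matrices[0])
    count = len(matrices)
    return [
        [sum(m[i][j] for m in matrices) / count for j in range(n)]
        for i in range(n)
    ]


# Hypothetical three-atom molecule with two made-up conformers:
# one bent, one fully stretched. Coordinates are illustrative only.
bent = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
stretched = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]

mean_matrix = mean_distance_matrix([bent, stretched])
for row in mean_matrix:
    print([round(value, 3) for value in row])
```

Bonded distances (atoms 0-1 and 1-2) are identical in both conformers, while the 0-2 distance varies; the average summarizes that variability without committing to a single, possibly inaccurate geometry.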

The two models were evaluated on the PCQM4M-LSC dataset used in the 2021 KDD Cup on OGB-LSC, where the task is to predict the HOMO-LUMO energy gap of molecules. The approach achieved a mean absolute error (MAE) of 0.1235 on the holdout test set, which placed the authors among the winners of the contest.
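For reference, MAE is the average of the absolute differences between predicted and reference values, expressed in the same units as the target. The sketch below shows the computation on made-up numbers; the values are hypothetical and are not taken from the dataset or the paper.

```python
def mean_absolute_error(predicted, target):
    """Average absolute difference between predictions and reference values."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


# Hypothetical gap values for three molecules (illustrative only).
predicted_gaps = [5.0, 4.5, 6.25]
reference_gaps = [5.25, 4.0, 6.25]

print(mean_absolute_error(predicted_gaps, reference_gaps))
```

Here the absolute errors are 0.25, 0.5, and 0.0, so the MAE is 0.25.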

Implications and Future Prospects

The research demonstrates that deep graph neural networks are effective for molecular property prediction and provides a starting point for further work in computational chemistry and materials science. From a practical perspective, the method reduces the time and computational resources needed to predict molecular properties, which matters for applications such as drug discovery and materials engineering.

On the modeling side, the results suggest that deeper architectures with enlarged receptive fields can outperform shallower ones on graph-based learning tasks, which points toward more expressive GNN designs. The gains from adding approximate 3D information from conformer sets also provide empirical evidence that combining 2D and 3D sources of information improves prediction.

Future directions inspired by this work may include refining the conformer generation process to capture finer-grained 3D molecular detail, integrating additional physical and chemical knowledge to guide GNN learning, and extending the approach to molecular properties beyond the HOMO-LUMO gap. The same strategy of substituting cheap, approximate structural inputs for expensive exact ones could also inform structural learning in other domains.

Overall, the paper broadens the scope of quantum chemical property prediction and lays groundwork for further use of graph-based deep learning in scientific computing. The implementation is available as part of the MoleculeX package, which academic and industrial researchers can use to reproduce or build on this work.
