Conditional Graph Information Bottleneck for Molecular Relational Learning (2305.01520v2)
Abstract: Molecular relational learning, whose goal is to learn the interaction behavior between molecular pairs, has seen a surge of interest in the molecular sciences due to its wide range of applications. Recently, graph neural networks have shown great success in molecular relational learning by modeling a molecule as a graph structure and considering atom-level interactions between two molecules. Despite their success, existing molecular relational learning methods tend to overlook the nature of chemistry, i.e., a chemical compound is composed of multiple substructures, such as functional groups, that cause distinctive chemical reactions. In this work, we propose a novel relational learning framework, called CGIB, that predicts the interaction behavior between a pair of graphs by detecting core subgraphs therein. The main idea is, given a pair of graphs, to find a subgraph of one graph that contains the minimal sufficient information regarding the task at hand, conditioned on the paired graph, based on the principle of the conditional graph information bottleneck. We argue that our proposed method mimics the nature of chemical reactions, i.e., the core substructure of a molecule varies depending on which other molecule it interacts with. Extensive experiments on various tasks with real-world datasets demonstrate the superiority of CGIB over state-of-the-art baselines. Our code is available at https://github.com/Namkyeong/CGIB.
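The "minimal sufficient information conditioned on the paired graph" idea can be sketched as an objective in the usual information bottleneck form. This is an illustrative formulation rather than a quotation from the paper: the symbols $\mathcal{G}^1$ and $\mathcal{G}^2$ (the paired input graphs), $\mathcal{G}^1_{\mathrm{CIB}}$ (the selected subgraph of $\mathcal{G}^1$), $\mathbf{Y}$ (the prediction target), and the trade-off weight $\beta$ are notational assumptions not defined in the abstract.

```latex
% Sketch of a conditional graph information bottleneck objective.
% First term: keep the subgraph predictive of Y given the paired graph (sufficiency).
% Second term: compress away the rest of G^1 given the paired graph (minimality).
\mathcal{G}^1_{\mathrm{CIB}}
  = \arg\min_{\mathcal{G}^1_{\mathrm{CIB}} \subseteq \mathcal{G}^1}
    \; -\, I\!\left(\mathbf{Y};\, \mathcal{G}^1_{\mathrm{CIB}} \,\middle|\, \mathcal{G}^2\right)
    \;+\; \beta \, I\!\left(\mathcal{G}^1;\, \mathcal{G}^1_{\mathrm{CIB}} \,\middle|\, \mathcal{G}^2\right)
```

Because both mutual information terms are conditioned on $\mathcal{G}^2$, the subgraph that minimizes the objective can change when the paired graph changes, which matches the claim that the core substructure of a molecule depends on its interaction partner.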