Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers (2402.04538v2)

Published 7 Feb 2024 in cs.LG

Abstract: Graph transformers typically lack third-order interactions, limiting their geometric understanding which is crucial for tasks like molecular geometry prediction. We propose the Triplet Graph Transformer (TGT) that enables direct communication between pairs within a 3-tuple of nodes via novel triplet attention and aggregation mechanisms. TGT is applied to molecular property prediction by first predicting interatomic distances from 2D graphs and then using these distances for downstream tasks. A novel three-stage training procedure and stochastic inference further improve training efficiency and model performance. Our model achieves new state-of-the-art (SOTA) results on open challenge benchmarks PCQM4Mv2 and OC20 IS2RE. We also obtain SOTA results on QM9, MOLPCBA, and LIT-PCBA molecular property prediction benchmarks via transfer learning. We also demonstrate the generality of TGT with SOTA results on the traveling salesman problem (TSP).
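The triplet attention described in the abstract, where each node pair exchanges information with a third node, can be illustrated with a small sketch. Below is a minimal, single-head PyTorch illustration under the assumption that the model maintains pair embeddings of shape (n, n, d) and that pair (i, j) attends over every third node k through the pairs (i, k) and (k, j); the class name, tensor layout, and residual update are illustrative choices made here, not the authors' exact mechanism.

```python
import torch
import torch.nn as nn


class TripletAttentionSketch(nn.Module):
    """Illustrative third-order update over pair embeddings.

    Each pair (i, j) attends over all third nodes k, using the pair
    embedding e[i, k] as the key and e[k, j] as the value. Hypothetical
    names and shapes; a sketch, not the paper's exact formulation.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)   # query from pair (i, j)
        self.k = nn.Linear(dim, dim)   # key from pair (i, k)
        self.v = nn.Linear(dim, dim)   # value from pair (k, j)
        self.out = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, e: torch.Tensor) -> torch.Tensor:
        # e: (n, n, dim) pair embeddings for an n-node graph
        q = self.q(e)                  # (n, n, d), one query per pair (i, j)
        k = self.k(e)                  # (n, n, d), keys indexed by (i, k)
        v = self.v(e)                  # (n, n, d), values indexed by (k, j)

        # Attention logits over the third node k for every pair (i, j):
        # logits[i, j, k] = <q[i, j], k[i, k]> / sqrt(d)   (O(n^3) overall)
        logits = torch.einsum("ijd,ikd->ijk", q, k) * self.scale
        attn = logits.softmax(dim=-1)

        # Aggregate along k: upd[i, j] = sum_k attn[i, j, k] * v[k, j]
        upd = torch.einsum("ijk,kjd->ijd", attn, v)
        return e + self.out(upd)       # residual update of the pair embeddings
```

As a usage check, TripletAttentionSketch(64)(torch.randn(8, 8, 64)) returns an updated (8, 8, 64) pair-embedding tensor. A full model would interleave such triplet updates with ordinary node-level attention layers; this third-order pair-to-pair communication is the kind of interaction the abstract argues standard graph transformers lack.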

