Equivariant Matrix Function Neural Networks (2310.10434v2)

Published 16 Oct 2023 in stat.ML, cond-mat.mtrl-sci, cs.LG, and physics.chem-ph

Abstract: Graph Neural Networks (GNNs), especially message-passing neural networks (MPNNs), have emerged as powerful architectures for learning on graphs in diverse applications. However, MPNNs face challenges when modeling non-local interactions in graphs such as large conjugated molecules and social networks due to oversmoothing and oversquashing. Although Spectral GNNs and traditional neural networks such as recurrent neural networks and transformers mitigate these challenges, they often lack generalizability, or fail to capture detailed structural relationships or symmetries in the data. To address these concerns, we introduce Matrix Function Neural Networks (MFNs), a novel architecture that parameterizes non-local interactions through analytic matrix equivariant functions. Employing resolvent expansions offers a straightforward implementation and the potential for linear scaling with system size. The MFN architecture achieves state-of-the-art performance in standard graph benchmarks, such as the ZINC and TU datasets, and is able to capture intricate non-local interactions in quantum systems, paving the way to new state-of-the-art force fields.

Authors (6)
  1. Ilyes Batatia (18 papers)
  2. Lars L. Schaaf (4 papers)
  3. Huajie Chen (37 papers)
  4. Gábor Csányi (84 papers)
  5. Christoph Ortner (91 papers)
  6. Felix A. Faber (11 papers)
Citations (4)

Summary

Equivariant Matrix Function Neural Networks

The paper introduces Matrix Function Neural Networks (MFNs), a novel approach designed to address limitations of Graph Neural Networks (GNNs), particularly in capturing non-local interactions in graph-based data. The authors focus on overcoming the challenges faced by existing Message-Passing Neural Networks (MPNNs), which typically struggle with oversmoothing and oversquashing when modeling large graphs with non-local dependencies, such as those found in complex chemical structures and social networks.

Key Contributions

  1. Introduction of Matrix Function Neural Networks (MFNs): MFNs parameterize non-local interactions through analytic matrix functions that are equivariant to the graph's inherent symmetries. This contrasts with spectral GNNs, which often rely on the graph Laplacian yet may fall short in capturing intricate structural relationships in the data; a minimal sketch of a matrix-function layer appears after this list.
  2. Resolvent Expansion Technique: The paper employs resolvent expansions to give MFNs a straightforward implementation with the potential for linear scaling in system size. The method approximates matrix functions through contour integration, with each pole contributing one resolvent, making it computationally efficient and well suited to the large graphs encountered in practice (see the second sketch below).
  3. Demonstrated Efficacy in Complex Systems: MFNs achieve state-of-the-art performance on standard benchmarks such as the ZINC and TU datasets. Notably, the architecture excels at capturing non-local interactions in quantum systems, an area where traditional GNNs have struggled.
  4. Equivariance and Flexibility: The architecture incorporates group equivariance using techniques from equivariant neural networks, so it respects the symmetries present in a given application domain, whether geometric or structural, allowing for broad applicability across fields; the first sketch below includes a numerical check of permutation equivariance.
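To make the first and fourth points concrete, here is a minimal NumPy sketch of a matrix-function layer. It is illustrative only, not the authors' implementation: the function name matrix_function_layer and the parameters alpha and beta are hypothetical. The layer assembles a symmetric, feature-dependent operator H from the adjacency matrix and scalar node features, evaluates the analytic matrix function f(H) = U diag(f(w)) U^T through an eigendecomposition, and reads updated node features off the diagonal of f(H). Because (P H P^T)^k = P H^k P^T for any node permutation P, such a layer is automatically permutation equivariant, which the final assertion checks numerically.

```python
import numpy as np

def matrix_function_layer(A, x, f=np.tanh, alpha=1.0, beta=1.0):
    """Toy matrix-function layer (illustrative names, not the paper's code).

    A            : (n, n) symmetric adjacency matrix
    x            : (n,) scalar node features
    f            : analytic scalar function applied to the spectrum of H
    alpha, beta  : hypothetical scalars mixing features and structure
    """
    # Symmetric graph operator: feature-dependent diagonal plus weighted adjacency.
    H = alpha * np.diag(x) + beta * A
    # Spectral evaluation of the matrix function: f(H) = U diag(f(w)) U^T.
    w, U = np.linalg.eigh(H)
    fH = U @ np.diag(f(w)) @ U.T
    # Non-local readout: every node's update depends on the whole graph via f(H).
    return np.diag(fH).copy()

# Numerical check of permutation equivariance: permuting the nodes permutes the output.
rng = np.random.default_rng(0)
A = rng.random((5, 5)); A = (A + A.T) / 2.0; np.fill_diagonal(A, 0.0)
x = rng.random(5)
P = np.eye(5)[rng.permutation(5)]                    # random permutation matrix
out = matrix_function_layer(A, x)
out_perm = matrix_function_layer(P @ A @ P.T, P @ x)
assert np.allclose(P @ out, out_perm)
```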
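The resolvent expansion behind the second point can be sketched just as briefly. The snippet below, again illustrative rather than the paper's code (resolvent_matrix_function and its parameters are hypothetical), approximates f(H) by discretizing the Cauchy integral f(H) = (1/2πi) ∮ f(z)(zI − H)^{-1} dz on a circle enclosing the spectrum of a symmetric matrix H. Each quadrature node contributes one resolvent; the dense inverse is used here only for clarity, since at scale each pole needs only linear solves or selected inversion with the shifted matrix, which is where the potential for linear scaling comes from.

```python
import numpy as np

def resolvent_matrix_function(H, f, center=0.0, radius=5.0, n_poles=32):
    """Approximate f(H) via a discretized contour (resolvent) integral.

    The circle |z - center| = radius must enclose the spectrum of the
    symmetric matrix H, and f must be analytic on and inside the contour.
    """
    n = H.shape[0]
    I = np.eye(n)
    acc = np.zeros((n, n), dtype=complex)
    for k in range(n_poles):
        theta = 2.0 * np.pi * k / n_poles
        z = center + radius * np.exp(1j * theta)
        # One resolvent per quadrature node; a dense inverse is used here only
        # for clarity, a sparse solve or selected inversion would be used at scale.
        acc += f(z) * np.exp(1j * theta) * np.linalg.inv(z * I - H)
    return (radius / n_poles) * acc            # trapezoidal rule on the circle

# Sanity check against the spectral definition of exp(H).
rng = np.random.default_rng(1)
H = rng.standard_normal((6, 6)); H = (H + H.T) / 2.0
w, U = np.linalg.eigh(H)
exact = U @ np.diag(np.exp(w)) @ U.T
approx = resolvent_matrix_function(H, np.exp, radius=1.5 * np.abs(w).max() + 1.0)
print(np.max(np.abs(approx.real - exact)))     # small; the rule converges rapidly for analytic f
```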

Implications and Future Directions

From a practical perspective, the ability of MFNs to model non-local interactions accurately holds great promise for fields such as materials science and quantum chemistry, where understanding the relationships within complex systems is crucial. Theoretically, MFNs suggest new pathways for integrating non-Euclidean geometry into deep learning models, providing a richer framework for capturing the underlying physics of the tasks at hand.

Looking forward, development could focus on extending MFN architectures to other equivariance-preserving group actions, potentially widening their applications. Further work could also optimize the computational aspects to fully realize the linear-scaling benefits on even larger and more complex datasets.

The introduction of MFNs represents a promising advance in graph neural networks, with meaningful contributions in both methodology and application results. By improving expressivity and efficiency in handling non-local interactions, MFNs pave the way toward more versatile and powerful GNN deployments across diverse scientific and engineering disciplines.