ChebMixer: Efficient Graph Representation Learning with MLP Mixer (2403.16358v2)

Published 25 Mar 2024 in cs.CV

Abstract: Graph neural networks have achieved remarkable success in learning graph representations, especially graph Transformers, which have recently shown superior performance on various graph mining tasks. However, graph Transformers generally treat nodes as tokens, which results in quadratic complexity in the number of nodes during self-attention computation. The graph MLP Mixer addresses this challenge by using the efficient MLP Mixer technique from computer vision. However, the time-consuming process of extracting graph tokens limits its performance. In this paper, we present a novel architecture named ChebMixer, a new graph MLP Mixer that uses fast Chebyshev polynomial-based spectral filtering to extract a sequence of tokens. First, we produce multiscale representations of graph nodes via fast Chebyshev polynomial-based spectral filtering. Next, we treat each node's multiscale representations as a sequence of tokens and refine the node representation with an effective MLP Mixer. Finally, we aggregate the multiscale representations of nodes through Chebyshev interpolation. Owing to the powerful representation capabilities and fast computational properties of the MLP Mixer, we can quickly extract more informative node representations to improve the performance of downstream tasks. The experimental results demonstrate significant improvements in a variety of scenarios ranging from graph node classification to medical image segmentation.
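The abstract describes a three-step pipeline: Chebyshev spectral filtering to build multiscale node tokens, an MLP Mixer over those tokens, and an aggregation across scales. The sketch below illustrates that pipeline in PyTorch under stated assumptions: the module names (`ChebTokenizer`, `MixerBlock`, `ChebMixerSketch`), the scaled-Laplacian construction (approximating lambda_max by 2), and the final aggregation (a learnable weighted sum standing in for the paper's Chebyshev interpolation) are illustrative choices, not the authors' implementation.

```python
# Minimal sketch of the ChebMixer pipeline from the abstract (PyTorch).
# All names and the aggregation step are assumptions for illustration.
import torch
import torch.nn as nn


def scaled_laplacian(adj: torch.Tensor) -> torch.Tensor:
    """Symmetric normalized Laplacian rescaled to roughly [-1, 1],
    using the common approximation lambda_max ~= 2."""
    deg = adj.sum(dim=1)
    d_inv_sqrt = torch.where(deg > 0, deg.pow(-0.5), torch.zeros_like(deg))
    lap = torch.eye(adj.size(0)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    return lap - torch.eye(adj.size(0))  # L_hat = 2L / lambda_max - I, lambda_max = 2


class ChebTokenizer(nn.Module):
    """Produces K multiscale node representations T_k(L_hat) X via the
    Chebyshev recurrence T_k = 2 L_hat T_{k-1} - T_{k-2}."""
    def __init__(self, order: int):
        super().__init__()
        self.order = order

    def forward(self, x: torch.Tensor, lap: torch.Tensor) -> torch.Tensor:
        tokens = [x, lap @ x]
        for _ in range(2, self.order):
            tokens.append(2 * lap @ tokens[-1] - tokens[-2])
        return torch.stack(tokens[: self.order], dim=1)  # (N, K, d)


class MixerBlock(nn.Module):
    """Standard MLP-Mixer block: token mixing across the K scales of each
    node, then channel mixing across features."""
    def __init__(self, num_tokens: int, dim: int, hidden: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, hidden), nn.GELU(), nn.Linear(hidden, num_tokens)
        )
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:  # (N, K, d)
        t = t + self.token_mlp(self.norm1(t).transpose(1, 2)).transpose(1, 2)
        t = t + self.channel_mlp(self.norm2(t))
        return t


class ChebMixerSketch(nn.Module):
    def __init__(self, in_dim: int, dim: int, order: int = 4, hidden: int = 64):
        super().__init__()
        self.proj = nn.Linear(in_dim, dim)
        self.tokenizer = ChebTokenizer(order)
        self.mixer = MixerBlock(order, dim, hidden)
        # The paper aggregates scales via Chebyshev interpolation; a
        # learnable weighted sum over scales stands in for it here.
        self.scale_weights = nn.Parameter(torch.full((order,), 1.0 / order))

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        tokens = self.tokenizer(self.proj(x), scaled_laplacian(adj))
        tokens = self.mixer(tokens)
        return (self.scale_weights[None, :, None] * tokens).sum(dim=1)  # (N, d)


# Toy usage: 5 nodes on a ring graph with 8-dimensional features.
adj = torch.zeros(5, 5)
for i in range(5):
    adj[i, (i + 1) % 5] = adj[(i + 1) % 5, i] = 1.0
model = ChebMixerSketch(in_dim=8, dim=16)
out = model(torch.randn(5, 8), adj)
print(out.shape)  # torch.Size([5, 16])
```

Note that the token-mixing MLP operates across the K polynomial orders of each node independently, so the cost stays linear in the number of nodes, matching the abstract's motivation for avoiding node-as-token self-attention.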
