Learning Multiplex Representations on Text-Attributed Graphs with One Language Model Encoder (2310.06684v2)
Abstract: In real-world scenarios, texts in a graph are often linked by multiple semantic relations (e.g., papers in an academic graph are referenced by other publications, written by the same author, or published in the same venue), where text documents and their relations form a multiplex text-attributed graph. Mainstream text representation learning methods use pretrained language models (PLMs) to generate one embedding for each text unit, expecting that all types of relations between texts can be captured by these single-view embeddings. However, this presumption does not hold, particularly in multiplex text-attributed graphs. Along another line of work, multiplex graph neural networks (GNNs) directly initialize node attributes as feature vectors for node representation learning, but they cannot fully capture the semantics of the nodes' associated texts. To bridge these gaps, we propose METAG, a new framework for learning Multiplex rEpresentations on Text-Attributed Graphs. In contrast to existing methods, METAG uses one text encoder to model the knowledge shared across relations and leverages a small number of parameters per relation to derive relation-specific representations. This allows the encoder to effectively capture the multiplex structures in the graph while also preserving parameter efficiency. We conduct experiments on nine downstream tasks in five graphs from both academic and e-commerce domains, where METAG outperforms baselines significantly and consistently. The code is available at https://github.com/PeterGriffinJin/METAG.
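To make the "one shared encoder plus a few parameters per relation" idea concrete, below is a minimal sketch (not the authors' implementation). It assumes the relation-specific parameters are a handful of learnable "soft prompt" embeddings prepended to the token embeddings of a shared BERT encoder; the model name, relation names, and prompt length are illustrative choices.

```python
# Minimal sketch, assuming per-relation parameters are soft-prompt embeddings
# prepended to a shared PLM encoder (an assumption; not METAG's exact code).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class MultiplexTextEncoder(nn.Module):
    def __init__(self, model_name="bert-base-uncased",
                 relations=("cited_by", "same_author", "same_venue"),
                 prompt_len=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # one encoder shared by all relations
        hidden = self.encoder.config.hidden_size
        self.prompt_len = prompt_len
        # The only relation-specific parameters: a few trainable vectors per relation.
        self.relation_prompts = nn.ParameterDict({
            r: nn.Parameter(torch.randn(prompt_len, hidden) * 0.02) for r in relations
        })

    def forward(self, input_ids, attention_mask, relation):
        batch = input_ids.size(0)
        # Word embeddings only; BERT adds position embeddings over the full sequence.
        word_emb = self.encoder.get_input_embeddings()(input_ids)            # (B, L, H)
        prompt = self.relation_prompts[relation].unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, word_emb], dim=1)                 # prepend relation prompt
        prompt_mask = torch.ones(batch, self.prompt_len,
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        out = self.encoder(inputs_embeds=inputs_embeds, attention_mask=mask)
        # Pool at the original [CLS] position (shifted right by the prompt length).
        return out.last_hidden_state[:, self.prompt_len]


# Usage: the same document yields different embeddings under different relations.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = MultiplexTextEncoder()
batch = tokenizer(["Graph neural networks for citation analysis."],
                  return_tensors="pt", padding=True, truncation=True)
z_cite = model(batch["input_ids"], batch["attention_mask"], relation="cited_by")
z_venue = model(batch["input_ids"], batch["attention_mask"], relation="same_venue")
```

With this layout, adding a new relation costs only `prompt_len × hidden_size` extra parameters, which is why the shared-encoder design stays parameter-efficient as the number of relation types grows.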