
GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism (2404.06911v1)

Published 10 Apr 2024 in cs.CL

Abstract: Pretrained Language Models (PLMs) benefit from external knowledge stored in graph structures for various downstream tasks. However, bridging the modality gap between graph structures and text remains a significant challenge. Traditional methods like linearizing graphs for PLMs lose vital graph connectivity, whereas Graph Neural Networks (GNNs) require cumbersome processes for integration into PLMs. In this work, we propose a novel graph-guided self-attention mechanism, GraSAME. GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts. As an end-to-end, lightweight multimodal module, GraSAME follows a multi-task learning strategy and effectively bridges the gap between graph and textual modalities, facilitating dynamic interactions between GNNs and PLMs. Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets. Furthermore, compared to SOTA models, GraSAME eliminates the need for extra pre-training tasks to adjust graph inputs and reduces the number of trainable parameters by over 100 million.
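The abstract describes the mechanism only at a high level. As a rough illustration, the sketch below shows one plausible reading of graph-guided self-attention: standard multi-head self-attention whose scores are masked by a token-level adjacency matrix, so that each token attends only to its neighbours in the graph. This is a minimal sketch under that assumption, not the authors' implementation; the class name GraphGuidedSelfAttention and its interface are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphGuidedSelfAttention(nn.Module):
    # Hypothetical sketch: multi-head self-attention whose scores are
    # masked by a token-level graph adjacency, so tokens attend only to
    # their neighbours in the input graph. Not the paper's actual code.

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, adj):
        # x:   (batch, seq_len, d_model) token states from the PLM layer
        # adj: (batch, seq_len, seq_len) 0/1 token-level adjacency derived
        #      from the input graph; assumed to include self-loops so every
        #      row of the attention matrix has at least one unmasked entry
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.reshape(b, n, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        # inject structure: forbid attention between token pairs that are
        # not connected in the token-level graph
        scores = scores.masked_fill(adj.unsqueeze(1) == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.out(ctx)

# Toy usage: four tokens forming a chain graph, plus self-loops.
x = torch.randn(1, 4, 64)
adj = (torch.eye(4) + torch.diag(torch.ones(3), 1)
       + torch.diag(torch.ones(3), -1)).unsqueeze(0)
layer = GraphGuidedSelfAttention(d_model=64, n_heads=4)
out = layer(x, adj)  # shape (1, 4, 64)

In a full system along the lines the abstract sketches, such a layer would sit inside a pretrained encoder, receive structure from a GNN over the input graph, and be trained end-to-end with a multi-task objective; the code above only illustrates the attention-masking step.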

Authors (2)
  1. Shuzhou Yuan (12 papers)
  2. Michael Färber (65 papers)
Citations (1)
