G2rammar: Bilingual Grammar Modeling for Enhanced Text-attributed Graph Learning
Abstract: Text-attributed graphs require models to effectively integrate both structural topology and semantic content. Recent approaches apply LLMs to graphs by linearizing structures into token sequences through random walks. These methods create concise graph vocabularies to replace verbose natural language descriptions. However, they overlook a critical component that makes language expressive: grammar. In natural language, grammar assigns syntactic roles to words and defines their functions within sentences. Similarly, nodes in graphs play distinct structural roles as hubs, bridges, or peripheral members. Current graph language methods provide tokens without grammatical annotations to indicate these structural or semantic roles. This absence limits LLMs' ability to reason about graph topology effectively. We propose \textbf{G2rammar}, a bilingual grammar framework that explicitly encodes both structural and semantic grammar for text-attributed graphs. Structural grammar characterizes topological roles through centrality and neighborhood patterns. Semantic grammar captures content relationships through textual informativity. The framework implements two-stage learning with structural grammar pre-training followed by semantic grammar fine-tuning. Extensive experiments on real-world datasets demonstrate that G2rammar consistently outperforms competitive baselines by providing LLMs with the grammatical context needed to understand graph structures.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.