
AutoSGNN: Automated Propagation Discovery

Updated 4 March 2026
  • The paper demonstrates a hybrid LLM and evolutionary NAS approach that automates spectral GNN propagation design, yielding top validation accuracy across benchmark datasets.
  • AutoSGNN unifies propagation discovery for both homophilic and heterophilic graphs by integrating diverse spectral filters and aggregation rules into a modular search space.
  • The framework optimizes candidate architectures using a fitness function based on validation accuracy, achieving competitive efficiency compared to existing NAS methods.

Automatic Propagation Discovery (AutoSGNN) is a neural architecture search (NAS) framework for spectral graph neural networks (GNNs) that automates the design of propagation mechanisms. The approach targets both homophilic and heterophilic graph structures, with particular emphasis on unifying the discovery of propagation forms adaptable to varying homophily levels. AutoSGNN jointly leverages an LLM for generative architecture proposal and evolutionary strategies (ES) for iterative model selection, achieving state-of-the-art accuracy and efficiency across a spectrum of graph learning benchmarks (Mo et al., 2024).

1. Search Space Formalization for Spectral Propagation

AutoSGNN defines a unified, modular search space for spectral GNNs by abstracting most published spectral GNNs into the following universal layer-wise template (Eq. 3 of (Mo et al., 2024)):

$$Z^{(K)} = \left\langle \mathrm{TRANS} \bigl( \mathrm{AGG}\{ G;\ X^{\mathrm{raw}};\ Z^{(K-1)} \} \bigr) \right\rangle_{K}$$

Here, the architecture space comprises:

  • Feature-Fitting Terms: weighted or residual connections, e.g., $Z_0 = \alpha X$ (weighted self-feature) or $Z_0 = \alpha X + \gamma X^{\mathrm{raw}}$.
  • Spectral Propagation Operators: polynomial spectral filters ($g_\theta(\Lambda) = \sum_{k=0}^{K} \theta_k \Lambda^k$), Chebyshev/Bernstein polynomial filters, adjacency powers ($\hat{A}^k$), thresholded adjacency matrices, and attention-weighted adjacency.
  • Aggregation/Combination Rules: Summation, concatenation, residual addition, gated attention across layers.

The filter $g_\theta(\Lambda)$, parametrized by $\theta_k$, translates into spatial propagation $g_\theta(L)X = \sum_{k=0}^{K} \theta_k L^k X$, or in renormalized form $\sum_{k=0}^{K} \theta_k \hat{A}^k X$. Each filter and aggregation operator forms a discrete or continuous NAS variable within the search grammar.
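As a concrete illustration, the unified layer template can be sketched in NumPy as a single polynomial-filter layer: AGG applies a truncated polynomial in the renormalized adjacency $\hat{A}$ to the raw features, and TRANS is a linear map. The helper names and the toy 3-node path graph are illustrative, not from the paper.

```python
import numpy as np

def renormalized_adjacency(A):
    """Compute the GCN-style renormalized adjacency D~^{-1/2}(A + I)D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def propagate(A_hat, X, theta):
    """AGG step: polynomial spectral filter sum_k theta_k * A_hat^k X."""
    Z, out = X.copy(), theta[0] * X
    for t in theta[1:]:
        Z = A_hat @ Z            # next adjacency power applied to features
        out = out + t * Z
    return out

def layer(A, X_raw, W, theta):
    """One instance of the unified template: TRANS(AGG{G; X_raw; Z})."""
    A_hat = renormalized_adjacency(A)
    return propagate(A_hat, X_raw, theta) @ W   # TRANS = linear map

# toy graph: a 3-node path, identity features and transform
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.eye(3)
W = np.eye(3)
theta = [0.5, 0.5]   # blend self-features with 1-hop propagation (low-pass)
Z = layer(A, X, W, theta)
```

Different choices of `theta` (and of the propagation operator inside `propagate`) correspond to different points in the search space; AutoSGNN's search generates such variants as code rather than fixing them by hand.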

2. LLM-Driven Evolutionary Architecture Generation

AutoSGNN’s outer search loop builds upon a hybrid LLM-ES paradigm, with key algorithmic steps:

  • Representation: Each candidate GNN is encoded as (1) a Python class implementing the unified form, and (2) a human-readable "design-idea" description.
  • Prompt Types:
    • Mutation (E1): Prompting the LLM for a new spectral filter substantially distinct from elite (top-performing) candidates.
    • Crossover (E2): Prompting the LLM to hybridize features from multiple elite architectures.
    • Preference (C1): Having the LLM analyze the strengths/weaknesses of high-vs-low scoring designs, then propose improvements.
  • Search Loop: At each generation, $N$ new candidates are LLM-generated, trained, and evaluated in parallel. Fitness is assigned by validation accuracy, and the top $E$ candidates update the elite set.

The process repeats for $T$ generations; the globally best validation performer forms the final model (Mo et al., 2024).
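The outer loop can be sketched as a simple $(E, N)$-style evolutionary search. To keep the sketch runnable without an API, the LLM proposal step and the train-and-evaluate step are stand-ins: `llm_propose` merely perturbs an elite's coefficient vector, and `fitness` scores closeness to a hypothetical target profile. Both are assumptions for illustration only.

```python
import random

def llm_propose(prompt):
    """Stand-in for an LLM mutation prompt: perturb an elite candidate.
    (The real system asks the LLM for new architecture code.)"""
    parent = prompt["elite"]
    return [t + random.gauss(0.0, 0.1) for t in parent]

def fitness(candidate):
    """Stand-in for train + validation accuracy: negative squared error
    to a hypothetical 'ideal' coefficient profile."""
    ideal = [0.5, 0.3, 0.2]
    return -sum((a - b) ** 2 for a, b in zip(candidate, ideal))

def search(N=12, T=30, E=4):
    """N candidates per generation, T generations, top-E elite set."""
    random.seed(0)
    elites = [[random.random() for _ in range(3)] for _ in range(E)]
    best, best_f = None, float("-inf")
    for _ in range(T):
        pop = [llm_propose({"elite": random.choice(elites)})
               for _ in range(N)]             # N LLM-generated candidates
        scored = sorted(pop, key=fitness, reverse=True)
        elites = scored[:E]                   # top-E update the elite set
        if fitness(scored[0]) > best_f:
            best, best_f = scored[0], fitness(scored[0])
    return best, best_f

best, score = search()   # run the toy search
```

The defaults `N=12` and `T=30` mirror the run configuration reported in the paper's experiments; everything else in the sketch is schematic.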

3. Fitness Function and Optimization Criteria

AutoSGNN adopts pure validation-set accuracy for candidate ranking:

$$F(\mathcal{A}) = \mathrm{Acc}_{\mathrm{val}}(\mathcal{A}) = \frac{\#\,\text{correct predictions}}{\#\,\text{validation nodes}}$$

Optionally, efficiency-aware variants subtract a runtime penalty:

$$F(\mathcal{A}) = \mathrm{Acc}_{\mathrm{val}}(\mathcal{A}) - \lambda\,\frac{\mathrm{Time}(\mathcal{A})}{T_{\max}}$$

All architectures taking longer than a pre-set timeout (e.g., 600 s) are removed from the population ($F = -\infty$).
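The two fitness variants and the timeout rule fit in a few lines; the parameter values below ($\lambda = 0.1$, the 600 s timeout) are example settings, with the timeout taken from the text above.

```python
def fitness(acc_val, runtime_s, lam=0.1, t_max=600.0):
    """Validation-accuracy fitness with an optional runtime penalty.
    Architectures exceeding the timeout get -inf and drop out of the population."""
    if runtime_s > t_max:
        return float("-inf")
    return acc_val - lam * runtime_s / t_max

# a fast candidate outranks an equally accurate slow one
fast = fitness(0.80, 60.0)         # 0.80 - 0.1 * (60/600)  = 0.79
slow = fitness(0.80, 540.0)        # 0.80 - 0.1 * (540/600) = 0.71
timed_out = fitness(0.95, 700.0)   # -inf: removed despite high accuracy
```

Setting `lam=0.0` recovers the pure validation-accuracy ranking used by default.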

4. Adaptation across Homophilic and Heterophilic Regimes

The search grammar is inherently capable of spanning both homophilic (neighboring nodes share class) and heterophilic (neighbors differ in class) propagation patterns by exposing both low-pass and high-pass spectral mechanisms:

  • Homophilic graphs: AutoSGNN’s search naturally gravitates to low-pass filters using powers of the renormalized adjacency ($\hat{A}^k$), with dominant coefficients favoring smooth signal propagation.
  • Heterophilic graphs: The system’s prompt structure and grammar allow frequent emergence of thresholded adjacencies and residual feature-injection, enabling the network to focus on strong, anomalous connections or directly propagate node features.

The preference prompts in the LLM-ES pipeline empirically guide spectral filter proposals to match the estimated homophily ratio of the graph data (Mo et al., 2024).
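A toy NumPy example makes the low-pass/high-pass distinction concrete. On a 4-cycle with alternating class labels (maximally heterophilic), the label signal is high-frequency: a low-pass filter $\hat{A}$ averages neighbors and attenuates it, while a high-pass filter $I - \hat{A}$ preserves and amplifies the alternation. The graph and filters here are illustrative, not drawn from the paper's experiments.

```python
import numpy as np

def renorm_adj(A):
    """Renormalized adjacency D~^{-1/2}(A + I)D~^{-1/2}."""
    A_t = A + np.eye(len(A))
    d = A_t.sum(axis=1)
    D = np.diag(d ** -0.5)
    return D @ A_t @ D

# 4-cycle: under a heterophilic labeling the two classes alternate,
# so the class signal x is high-frequency on this graph
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x = np.array([1.0, -1.0, 1.0, -1.0])   # alternating class signal

A_hat = renorm_adj(A)
low = A_hat @ x                   # low-pass: neighbor averaging
high = (np.eye(4) - A_hat) @ x    # high-pass: emphasizes disagreement

# the low-pass output shrinks the heterophilic signal,
# while the high-pass output amplifies it
```

This is precisely why a search space exposing both filter families can adapt to either regime without hand-tuning.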

5. Experimental Protocol and Comparative Results

Extensive experimentation substantiates the effectiveness of AutoSGNN on nine benchmark node-classification datasets, spanning both homophilic (Cora, Citeseer, PubMed, Amazon Computers/Photo) and heterophilic (Chameleon, Squirrel, Texas, Cornell) regimes. Protocol specifics:

  • Metric: Node-classification accuracy under 2.5%/2.5%/95% split (train/val/test).
  • Comparisons: State-of-the-art spectral GNNs (APPNP, GPRGNN, FAGCN, BernNet, JacobiConv, NFGNN, GCN, ChebNet) and NAS-based methods (GTFGNAS, F2GNN, Genetic-GNN, SANE).
  • Results: AutoSGNN achieved the top Wilcoxon–Holm rank ($p = 0.05$) on 7/9 datasets. For example, on Cora:
    • APPNP: $79.41 \pm 0.38$
    • GCN: $75.21 \pm 0.38$
    • AutoSGNN: $80.61 \pm 1.52$ (mean $\pm$ std, 2.5% split)
  • Efficiency: Each run involves $N = 12$ candidates per generation over $T = 30$ generations (360 candidates total). On PubMed, the average full search takes 176 minutes, with LLM inference accounting for 107 minutes. Compared to differentiable NAS (SANE/F2GNN, $\sim$80 min) and evolutionary NAS (GTFGNAS/Genetic-GNN, 250–300 min), AutoSGNN exhibits competitive wall-clock efficiency (Mo et al., 2024).

6. Broader Methodological Significance

AutoSGNN’s unification of LLM-driven generative architecture proposals with evolutionary population refinement constitutes a hybrid NAS paradigm for GNNs. The success of AutoSGNN demonstrates that this approach can:

  • Generalize across the spectrum from homophilic to heterophilic graphs without requiring human-curated filter design.
  • Search a broader architectural space than gradient-based (differentiable) NAS frameworks, since it does not rely on continuous relaxations.
  • Incorporate input data characteristics (e.g., homophily statistics) implicitly into the search via preference-guided LLM prompting.

A plausible implication is that similar LLM+ES joint pipelines may accelerate NAS research in other domains where expert-crafted design grammars are insufficient for handling structural diversity, particularly for non-Euclidean or relational data.

7. Relation to Heterogeneous Network NAS Approaches

While AutoSGNN operates primarily in the spectral design domain and is agnostic to node/edge types, related research on NAS for heterogeneous information networks (e.g., AutoGNR (Li et al., 10 Jan 2025)) addresses type-aware propagation path discovery through non-recursive message passing. Such frameworks search over explicit node-type subset selections at each hop, using bi-level optimization over both model and architecture parameters, and demonstrate direct performance gains from suppressing uncorrelated type aggregations. A distinction thus emerges: AutoSGNN abstracts propagation in spectral space and supports type-agnostic adaptation, whereas frameworks like AutoGNR operate over explicit type paths with non-recursive aggregation. Both advance principled automated discovery of effective graph propagation mechanisms for complex domains.
