
Light Graph Convolution (LGC)

Updated 7 April 2026
  • Light Graph Convolution (LGC) is a streamlined graph neural network that removes non-linear activations and feature transformations to retain only normalized neighborhood aggregation.
  • It achieves state-of-the-art performance in recommendation systems by efficiently combining multi-layer embeddings, yielding significant improvements such as a +16.5% gain in Recall@20 over deeper models.
  • LGC extends to multi-criteria recommendation and graph classification while mitigating over-smoothing through uniform aggregation and essential normalization techniques.

Light Graph Convolution (LGC) is a class of graph neural network (GNN) architectures designed to deliver expressive, scalable, and efficient message-passing operations for graph-based learning tasks while minimizing architectural complexity. The defining characteristic of LGC is aggressive simplification: LGC models eschew both nonlinear activation functions and learnable feature transformation matrices within each convolutional layer, retaining only neighborhood aggregation with normalization. This approach has demonstrated state-of-the-art performance in collaborative filtering (CF), multi-criteria recommendation, and graph classification, establishing that, for many graph domains, “less is more.”

1. Theoretical Foundations and Core Design

Standard GCN layers typically consist of three operations: feature transformation (multiplication by learnable matrices), neighborhood aggregation, and application of pointwise nonlinear activation (e.g., ReLU). LGC deliberately removes both feature transformation and nonlinear activation, retaining only normalized linear neighborhood aggregation. This design is motivated by empirical findings that, in collaborative filtering graphs where nodes are represented as ID embeddings without attribute features, both transformations and nonlinearities contribute little and may even degrade performance. The result is a highly streamlined layer structure directly aggregating information from normalized neighbors, followed by linear combination of multiple layers' outputs to form the final node representation (He et al., 2020, Gao et al., 2020, Park et al., 2023).
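As a minimal sketch of this simplification (not the authors' reference code; `A_norm`, `X`, and `W` below are illustrative), a standard GCN layer and an LGC layer differ only in the learnable transform and the activation:

```python
import numpy as np

def gcn_layer(A_norm, X, W):
    """Standard GCN layer: transform, aggregate, then apply a nonlinearity (ReLU)."""
    return np.maximum(A_norm @ X @ W, 0.0)

def lgc_layer(A_norm, X):
    """Light Graph Convolution layer: normalized neighborhood aggregation only,
    with no learnable transform and no activation."""
    return A_norm @ X

# Tiny illustration: 4 nodes on a cycle, 8-dimensional embeddings
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
A_norm = A / np.sqrt(np.outer(deg, deg))   # symmetric sqrt-degree normalization
X = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 8))
print(gcn_layer(A_norm, X, W).shape, lgc_layer(A_norm, X).shape)  # (4, 8) (4, 8)
```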

2. Mathematical Formulation of LGC Layers

A canonical LGC layer propagates embeddings as follows:

  • For user–item bipartite graphs:

    • User nodes:

      $e_u^{(k+1)} = \sum_{i \in \mathcal N_u} \frac{1}{\sqrt{|\mathcal N_u|}\,\sqrt{|\mathcal N_i|}}\, e_i^{(k)}$

    • Item nodes:

      $e_i^{(k+1)} = \sum_{u \in \mathcal N_i} \frac{1}{\sqrt{|\mathcal N_i|}\,\sqrt{|\mathcal N_u|}}\, e_u^{(k)}$
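The bipartite propagation above can be sketched with sparse matrices as follows; the interaction matrix `R`, the embedding dimensions, and the helper name `bipartite_lgc_step` are illustrative assumptions rather than the papers' reference implementation:

```python
import numpy as np
import scipy.sparse as sp

def bipartite_lgc_step(R, e_user, e_item):
    """One LGC propagation step on a user-item bipartite graph.

    R      : (num_users x num_items) sparse 0/1 interaction matrix
    e_user : (num_users x d) user embeddings at layer k
    e_item : (num_items x d) item embeddings at layer k
    Returns layer k+1 user and item embeddings under symmetric sqrt normalization.
    """
    du = np.asarray(R.sum(axis=1)).ravel()           # user degrees |N_u|
    di = np.asarray(R.sum(axis=0)).ravel()            # item degrees |N_i|
    inv_sqrt_u = sp.diags(1.0 / np.sqrt(np.maximum(du, 1)))
    inv_sqrt_i = sp.diags(1.0 / np.sqrt(np.maximum(di, 1)))
    R_norm = inv_sqrt_u @ R @ inv_sqrt_i               # 1 / (sqrt|N_u| sqrt|N_i|)
    return R_norm @ e_item, R_norm.T @ e_user          # new user, new item embeddings

# Toy example: 3 users, 4 items, 8-dimensional embeddings
R = sp.csr_matrix(np.array([[1, 0, 1, 0],
                            [0, 1, 1, 0],
                            [1, 1, 0, 1]], dtype=float))
rng = np.random.default_rng(0)
e_u = rng.standard_normal((3, 8))
e_i = rng.standard_normal((4, 8))
e_u_next, e_i_next = bipartite_lgc_step(R, e_u, e_i)
print(e_u_next.shape, e_i_next.shape)   # (3, 8) (4, 8)
```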

For general graphs with node features $X \in \mathbb{R}^{n \times d}$, multi-hop kernels $T_k$ can be used for $k$-order aggregation:

$Z^k = T_k X W$

where $W$ is either omitted entirely (as in pure LGC) or shared across all hops in "light" multi-order extensions (Gao et al., 2020).
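A hedged sketch of this shared-weight multi-order aggregation; the kernel list `T_list` (e.g., powers of a normalized adjacency) and the shapes are assumptions:

```python
def light_multi_order(T_list, X, W=None):
    """Compute Z^k = T_k X W for each hop k, with W either omitted (pure LGC)
    or a single matrix shared across all hops (light multi-order variants)."""
    XW = X if W is None else X @ W
    return [T_k @ XW for T_k in T_list]
```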

After $K$ layers, the final node embedding is formed as a (typically uniform) weighted sum over all propagation layers:

$e^{*} = \sum_{k=0}^{K} \alpha_k e^{(k)}$

with $\alpha_k \geq 0$ denoting the importance of the $k$-th layer embedding. For most settings, uniform weighting $\alpha_k = \frac{1}{K+1}$ suffices.

The prediction score in recommendation is given by the inner product $\hat{y}_{ui} = e_u^{*\top} e_i^{*}$. In multi-criteria (MC) recommendation, additional channels model user- and criterion-specific preferences (Park et al., 2023).
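A minimal sketch of the layer combination and inner-product scoring, assuming uniform weights $\alpha_k = 1/(K+1)$ by default; all names here are illustrative:

```python
import numpy as np

def combine_layers(layer_embs, alphas=None):
    """Final embeddings e* = sum_k alpha_k e^(k); uniform alpha_k = 1/(K+1) by default."""
    if alphas is None:
        alphas = np.full(len(layer_embs), 1.0 / len(layer_embs))
    return sum(a * e for a, e in zip(alphas, layer_embs))

def score_all_pairs(e_user_star, e_item_star):
    """Predicted preferences as the inner product of final user and item embeddings."""
    return e_user_star @ e_item_star.T    # (num_users x num_items) score matrix
```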

3. Algorithmic Structure and Training Procedures

A representative training loop for LGC, as per LightGCN, comprises:

  • Initialization of ID embeddings for nodes.
  • Precomputing degree matrices and normalized adjacency for efficient propagation.
  • Batchwise propagation of embeddings across $K$ layers with pure aggregation (sparse matrix multiplication).
  • Weighted combination of layer-wise embeddings.
  • Scoring user–item (or node–node) pairs via inner product.
  • Bayesian Personalized Ranking (BPR) loss:

$L_{\mathrm{BPR}} = -\sum_{u=1}^{M} \sum_{i \in \mathcal N_u} \sum_{j \notin \mathcal N_u} \ln \sigma\!\left(\hat{y}_{ui} - \hat{y}_{uj}\right) + \lambda \left\lVert E^{(0)} \right\rVert^{2}$

where only the initial embeddings are regularized.

  • Backpropagation only with respect to the initial embeddings (since layers have no parameters).

Pseudocode for such an LGC training regime is detailed in (He et al., 2020).
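The loss above can be sketched as follows, assuming mini-batches of (user, positive item, negative item) index tensors; this is an illustrative PyTorch snippet, not the authors' released code:

```python
import torch
import torch.nn.functional as F

def bpr_loss(e0_user, e0_item, eu_final, ei_final, users, pos, neg, reg=1e-4):
    """BPR loss over (user, positive item, negative item) triples, with L2
    regularization applied only to the layer-0 (initial) embeddings."""
    pos_scores = (eu_final[users] * ei_final[pos]).sum(dim=1)
    neg_scores = (eu_final[users] * ei_final[neg]).sum(dim=1)
    ranking = -F.logsigmoid(pos_scores - neg_scores).mean()
    reg_term = reg * (e0_user[users].pow(2).sum()
                      + e0_item[pos].pow(2).sum()
                      + e0_item[neg].pow(2).sum()) / users.numel()
    return ranking + reg_term
```

Because the propagation layers carry no parameters, the optimizer updates only the initial embedding tables; the propagated embeddings are recomputed from them at each step.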

4. LGC in Collaborative Filtering and Multi-Criteria Recommendation

In collaborative filtering, LGC (as instantiated by LightGCN) substantially outperforms deeper GCNs such as NGCF (Neural Graph Collaborative Filtering), Mult-VAE, and Laplacian-regularized MF frameworks. Across large benchmarks (Gowalla, Yelp2018, Amazon-Book), LightGCN achieves a +16.5% average improvement in Recall@20 and NDCG@20 over NGCF (He et al., 2020).

For multi-criteria recommendation, CPA-LGC extends the LGC design to expanded bipartite MC graphs, incorporating two embedding channels per node: the standard structural embeddings (propagated via LGC) and user/item-specific criteria preference embeddings. The MC expansion graph includes user nodes, item-criterion nodes, and weighted edges reflecting multi-criteria ratings. User-specific preference vectors and item-specific criterion vectors are propagated in parallel without parameterized transforms, and fusion occurs at the prediction stage via elementwise addition and dot product. CPA-LGC demonstrates up to +141% improvement in Precision@5 over MC models, and 20–35% outperformance relative to naive LightGCN-based MC architectures (Park et al., 2023).
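The snippet below is only a schematic of the fusion step described above (elementwise addition of the two channels followed by a dot product); the variable names are hypothetical, and the exact CPA-LGC architecture should be taken from Park et al. (2023):

```python
import numpy as np

def cpa_style_score(e_u, p_u, e_ic, q_ic):
    """Schematic two-channel fusion: add the structural embedding and the
    criteria-preference embedding per node, then score with a dot product."""
    return np.dot(e_u + p_u, e_ic + q_ic)
```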

5. Light Multi-Order Convolution for Graph Classification

Beyond bipartite recommendation structures, LGC concepts have been incorporated into multi-hop GNNs for graph classification, e.g., "LiCheb" and "LiMixHop." These variants conduct $k$-hop message aggregation using recursively computed kernels ($T_0 = I$, $T_1$ the normalized adjacency or Laplacian, and so on) but replace the costly, parameter-heavy merging with a lightweight, channel-wise weight-sharing mechanism. Outputs from all hops are summed with learned per-hop weights and a bias. This reduces the overall parameter count from $O(K\, d_{\mathrm{in}} d_{\mathrm{out}})$ (Chebyshev) to $O(d_{\mathrm{in}} d_{\mathrm{out}} + K)$. Empirical results on PROTEINS, DD, NCI1/109, MUTAG, and FRANK show that LiCheb and LiMixHop match or exceed the accuracy of their non-light analogs at roughly half the parameter cost (Gao et al., 2020).
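A hedged sketch of this light merging mechanism, assuming Chebyshev-style kernel recursion and treating `hop_weights` and `bias` as the learned per-hop scalars and bias described above:

```python
import numpy as np

def cheb_kernels(L_norm, K):
    """Chebyshev-style recursion for multi-hop kernels:
    T_0 = I, T_1 = L_norm, T_k = 2 L_norm T_{k-1} - T_{k-2}."""
    T = [np.eye(L_norm.shape[0]), L_norm]
    for _ in range(2, K + 1):
        T.append(2 * L_norm @ T[-1] - T[-2])
    return T[:K + 1]

def light_merge(T_list, X, W, hop_weights, bias):
    """Light merging: one shared transform W, one learned scalar per hop, and one
    bias, instead of a separate weight matrix per hop."""
    return sum(w * (T_k @ X @ W) for w, T_k in zip(hop_weights, T_list)) + bias
```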

6. Normalization, Over-Smoothing, and Ablation Analyses

Symmetric normalization, as in $\frac{1}{\sqrt{|\mathcal N_u|}\,\sqrt{|\mathcal N_i|}}$, is essential for stable aggregation and performance. Comparative analyses show that asymmetric (single-side) or $L_1$ normalizations are less effective (by up to 10% relative loss in Recall/NDCG).
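A minimal sketch of computing the symmetrically normalized adjacency $D^{-1/2} A D^{-1/2}$ with SciPy sparse matrices (the function name is illustrative):

```python
import numpy as np
import scipy.sparse as sp

def sym_norm_adj(A):
    """Symmetrically normalized adjacency D^{-1/2} A D^{-1/2} for a sparse matrix A."""
    deg = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(deg), 0.0)
    D_inv_sqrt = sp.diags(d_inv_sqrt)
    return D_inv_sqrt @ A @ D_inv_sqrt
```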

Ablation studies confirm:

  • Removing feature transforms (the learnable matrices $W$) and nonlinearities (e.g., ReLU) each benefits performance, and their combination maximizes gains (+12.6% Recall vs. standard NGCF).
  • Combining embeddings from all propagation layers (rather than using the $K$-th layer only) addresses node over-smoothing, paralleling personalized PageRank and self-connected SGCN designs.
  • PairNorm and similar normalization methods, when used in MC settings, prevent embedding collapse in deep light GCN stacks.

7. Practical Considerations and Empirical Guidelines

  • Optimal propagation depth $K$ is typically $3$ or $4$; larger $K$ can induce over-smoothing in the absence of node features.
  • Uniform aggregation weights are sufficient in most cases; attention-based weighting yields marginal improvement (<1%).
  • For light multi-order convolution, the information gain from additional hops decays exponentially: the practical value of $K$ can be chosen based on a targeted percentage of retained information, with a small number of hops typically sufficient for near-complete coverage (Gao et al., 2020).
  • LGC-based methods offer highly competitive runtime and memory profiles, scaling linearly in the number of edges and not requiring dropout or extensive regularization.

Summary Table: LGC vs. Dense GCN (Recommendation)

Model    | Transforms & Activation | Aggregation                                        | Parameterization | Mean Rel. Impr. Recall@20 / NDCG@20
NGCF     | W + nonlinearity        | Sum                                                | High             | baseline
LightGCN | None                    | Sum, symmetric $1/\sqrt{|\mathcal N_u|\,|\mathcal N_i|}$ norm | Minimal          | +16.5% / +16.4%

LGC methodology achieves competitive or superior accuracy with far less architectural complexity, challenging the paradigm that deeper nonlinear transformations are universally beneficial in graph learning tasks. Its extension to multi-criteria and multi-order domains further establishes LGC as a core principle in GNN design for large-scale, sparse, or ID-centric graphs (He et al., 2020, Gao et al., 2020, Park et al., 2023).
