
Light Graph Convolution (LGC)

Updated 7 April 2026
  • Light Graph Convolution (LGC) is a streamlined graph neural network that removes non-linear activations and feature transformations to retain only normalized neighborhood aggregation.
  • It achieves state-of-the-art performance in recommendation systems by efficiently combining multi-layer embeddings, yielding significant improvements such as a +16.5% gain in Recall@20 over deeper models.
  • LGC extends to multi-criteria recommendation and graph classification while mitigating over-smoothing through uniform aggregation and essential normalization techniques.

Light Graph Convolution (LGC) is a class of graph neural network (GNN) architectures designed to deliver expressive, scalable, and efficient message-passing operations for graph-based learning tasks while minimizing architectural complexity. The defining characteristic of LGC is aggressive simplification: LGC models eschew both nonlinear activation functions and learnable feature transformation matrices within each convolutional layer, retaining only neighborhood aggregation with normalization. This approach has demonstrated state-of-the-art performance in collaborative filtering (CF), multi-criteria recommendation, and graph classification, establishing that, for many graph domains, “less is more.”

1. Theoretical Foundations and Core Design

Standard GCN layers typically consist of three operations: feature transformation (multiplication by learnable matrices), neighborhood aggregation, and application of pointwise nonlinear activation (e.g., ReLU). LGC deliberately removes both feature transformation and nonlinear activation, retaining only normalized linear neighborhood aggregation. This design is motivated by empirical findings that, in collaborative filtering graphs where nodes are represented as ID embeddings without attribute features, both transformations and nonlinearities contribute little and may even degrade performance. The result is a highly streamlined layer structure directly aggregating information from normalized neighbors, followed by linear combination of multiple layers' outputs to form the final node representation (He et al., 2020, Gao et al., 2020, Park et al., 2023).
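As a minimal sketch of this simplification (not the authors' reference code; `A_norm`, `X`, and `W` below are illustrative), a standard GCN layer and an LGC layer differ only in the learnable transform and the activation:

```python
import numpy as np

def gcn_layer(A_norm, X, W):
    """Standard GCN layer: transform, aggregate, then apply a nonlinearity (ReLU)."""
    return np.maximum(A_norm @ X @ W, 0.0)

def lgc_layer(A_norm, X):
    """Light Graph Convolution layer: normalized neighborhood aggregation only,
    with no learnable transform and no activation."""
    return A_norm @ X

# Tiny illustration: 4 nodes on a cycle, 8-dimensional embeddings
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
A_norm = A / np.sqrt(np.outer(deg, deg))   # symmetric sqrt-degree normalization
X = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 8))
print(gcn_layer(A_norm, X, W).shape, lgc_layer(A_norm, X).shape)  # (4, 8) (4, 8)
```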

2. Mathematical Formulation of LGC Layers

A canonical LGC layer propagates embeddings as follows:

  • For user–item bipartite graphs:

    • User nodes:

      $e_u^{(k+1)} = \sum_{i \in \mathcal N_u} \frac{1}{\sqrt{|\mathcal N_u|}\,\sqrt{|\mathcal N_i|}}\, e_i^{(k)}$

    • Item nodes:

      $e_i^{(k+1)} = \sum_{u \in \mathcal N_i} \frac{1}{\sqrt{|\mathcal N_i|}\,\sqrt{|\mathcal N_u|}}\, e_u^{(k)}$
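The bipartite propagation above can be sketched with sparse matrices as follows; the interaction matrix `R`, the embedding dimensions, and the helper name `bipartite_lgc_step` are illustrative assumptions rather than the papers' reference implementation:

```python
import numpy as np
import scipy.sparse as sp

def bipartite_lgc_step(R, e_user, e_item):
    """One LGC propagation step on a user-item bipartite graph.

    R      : (num_users x num_items) sparse 0/1 interaction matrix
    e_user : (num_users x d) user embeddings at layer k
    e_item : (num_items x d) item embeddings at layer k
    Returns layer k+1 user and item embeddings under symmetric sqrt normalization.
    """
    du = np.asarray(R.sum(axis=1)).ravel()           # user degrees |N_u|
    di = np.asarray(R.sum(axis=0)).ravel()            # item degrees |N_i|
    inv_sqrt_u = sp.diags(1.0 / np.sqrt(np.maximum(du, 1)))
    inv_sqrt_i = sp.diags(1.0 / np.sqrt(np.maximum(di, 1)))
    R_norm = inv_sqrt_u @ R @ inv_sqrt_i               # 1 / (sqrt|N_u| sqrt|N_i|)
    return R_norm @ e_item, R_norm.T @ e_user          # new user, new item embeddings

# Toy example: 3 users, 4 items, 8-dimensional embeddings
R = sp.csr_matrix(np.array([[1, 0, 1, 0],
                            [0, 1, 1, 0],
                            [1, 1, 0, 1]], dtype=float))
rng = np.random.default_rng(0)
e_u = rng.standard_normal((3, 8))
e_i = rng.standard_normal((4, 8))
e_u_next, e_i_next = bipartite_lgc_step(R, e_u, e_i)
print(e_u_next.shape, e_i_next.shape)   # (3, 8) (4, 8)
```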

For general graphs with node features $X \in \mathbb{R}^{n \times d}$, multi-hop kernels $T_k$ can be used for $k$-order aggregation:

$Z^k = T_k X W$

where $W$ is either omitted entirely (as in pure LGC) or shared across all hops in "light" multi-order extensions (Gao et al., 2020).
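A hedged sketch of this shared-weight multi-order aggregation; the kernel list `T_list` (e.g., powers of a normalized adjacency) and the shapes are assumptions:

```python
def light_multi_order(T_list, X, W=None):
    """Compute Z^k = T_k X W for each hop k, with W either omitted (pure LGC)
    or a single matrix shared across all hops (light multi-order variants)."""
    XW = X if W is None else X @ W
    return [T_k @ XW for T_k in T_list]
```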

After $K$ layers, the final node embedding is formed as a (typically uniform) weighted sum over all propagation layers:

$e^{*} = \sum_{k=0}^{K} \alpha_k e^{(k)}$

with $\alpha_k \geq 0$ denoting the importance of the $k$-th layer embedding. For most settings, uniform weighting $\alpha_k = \frac{1}{K+1}$ suffices.

The prediction score in recommendation is given by the inner product $\hat{y}_{ui} = e_u^{*\top} e_i^{*}$. In multi-criteria (MC) recommendation, additional channels model user- and criterion-specific preferences (Park et al., 2023).
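A minimal sketch of the layer combination and inner-product scoring, assuming uniform weights $\alpha_k = 1/(K+1)$ by default; all names here are illustrative:

```python
import numpy as np

def combine_layers(layer_embs, alphas=None):
    """Final embeddings e* = sum_k alpha_k e^(k); uniform alpha_k = 1/(K+1) by default."""
    if alphas is None:
        alphas = np.full(len(layer_embs), 1.0 / len(layer_embs))
    return sum(a * e for a, e in zip(alphas, layer_embs))

def score_all_pairs(e_user_star, e_item_star):
    """Predicted preferences as the inner product of final user and item embeddings."""
    return e_user_star @ e_item_star.T    # (num_users x num_items) score matrix
```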

3. Algorithmic Structure and Training Procedures

A representative training loop for LGC, as per LightGCN, comprises:

  • Initialization of ID embeddings for nodes.
  • Precomputing degree matrices and normalized adjacency for efficient propagation.
  • Batchwise propagation of embeddings across $K$ layers with pure aggregation (sparse matrix multiplication).
  • Weighted combination of layer-wise embeddings.
  • Scoring user–item (or node–node) pairs via inner product.
  • Bayesian Personalized Ranking (BPR) loss:

$L_{\mathrm{BPR}} = -\sum_{u=1}^{M} \sum_{i \in \mathcal N_u} \sum_{j \notin \mathcal N_u} \ln \sigma\!\left(\hat{y}_{ui} - \hat{y}_{uj}\right) + \lambda \left\lVert E^{(0)} \right\rVert^{2}$

where only the initial embeddings are regularized.

  • Backpropagation only with respect to the initial embeddings (since layers have no parameters).

Pseudocode for such an LGC training regime is detailed in (He et al., 2020).
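The loss above can be sketched as follows, assuming mini-batches of (user, positive item, negative item) index tensors; this is an illustrative PyTorch snippet, not the authors' released code:

```python
import torch
import torch.nn.functional as F

def bpr_loss(e0_user, e0_item, eu_final, ei_final, users, pos, neg, reg=1e-4):
    """BPR loss over (user, positive item, negative item) triples, with L2
    regularization applied only to the layer-0 (initial) embeddings."""
    pos_scores = (eu_final[users] * ei_final[pos]).sum(dim=1)
    neg_scores = (eu_final[users] * ei_final[neg]).sum(dim=1)
    ranking = -F.logsigmoid(pos_scores - neg_scores).mean()
    reg_term = reg * (e0_user[users].pow(2).sum()
                      + e0_item[pos].pow(2).sum()
                      + e0_item[neg].pow(2).sum()) / users.numel()
    return ranking + reg_term
```

Because the propagation layers carry no parameters, the optimizer updates only the initial embedding tables; the propagated embeddings are recomputed from them at each step.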

4. LGC in Collaborative Filtering and Multi-Criteria Recommendation

In collaborative filtering, LGC (as instantiated by LightGCN) substantially outperforms deeper GCNs such as NGCF (Neural Graph Collaborative Filtering), Mult-VAE, and Laplacian-regularized MF frameworks. Across large benchmarks (Gowalla, Yelp2018, Amazon-Book), LightGCN achieves a +16.5% average improvement in Recall@20 and NDCG@20 over NGCF (He et al., 2020).

For multi-criteria recommendation, CPA-LGC extends the LGC design to expanded bipartite MC graphs, incorporating two embedding channels per node: the standard structural embeddings (propagated via LGC) and user/item-specific criteria preference embeddings. The MC expansion graph includes user nodes, item-criterion nodes, and weighted edges reflecting multi-criteria ratings. User-specific preference vectors and item-specific criterion vectors are propagated in parallel without parameterized transforms, and fusion occurs at the prediction stage via elementwise addition and dot product. CPA-LGC demonstrates up to +141% improvement in Precision@5 over MC models, and 20–35% outperformance relative to naive LightGCN-based MC architectures (Park et al., 2023).
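The snippet below is only a schematic of the fusion step described above (elementwise addition of the two channels followed by a dot product); the variable names are hypothetical, and the exact CPA-LGC architecture should be taken from Park et al. (2023):

```python
import numpy as np

def cpa_style_score(e_u, p_u, e_ic, q_ic):
    """Schematic two-channel fusion: add the structural embedding and the
    criteria-preference embedding per node, then score with a dot product."""
    return np.dot(e_u + p_u, e_ic + q_ic)
```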

5. Light Multi-Order Convolution for Graph Classification

Beyond bipartite recommendation structures, LGC concepts have been incorporated into multi-hop GNNs for graph classification, e.g., "LiCheb" and "LiMixHop." These variants conduct $k$-hop message aggregation using recursively computed kernels ($T_0 = I$, $T_1$ the normalized adjacency or Laplacian, and so on) but replace the costly, parameter-heavy merging with a lightweight, channel-wise weight-sharing mechanism. Outputs from all hops are summed with learned per-hop weights and a bias. This reduces the overall parameter count from $O(K\, d_{\mathrm{in}} d_{\mathrm{out}})$ (Chebyshev) to $O(d_{\mathrm{in}} d_{\mathrm{out}} + K)$. Empirical results on PROTEINS, DD, NCI1/109, MUTAG, and FRANK show that LiCheb and LiMixHop match or exceed the accuracy of their non-light analogs at roughly half the parameter cost (Gao et al., 2020).
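A hedged sketch of this light merging mechanism, assuming Chebyshev-style kernel recursion and treating `hop_weights` and `bias` as the learned per-hop scalars and bias described above:

```python
import numpy as np

def cheb_kernels(L_norm, K):
    """Chebyshev-style recursion for multi-hop kernels:
    T_0 = I, T_1 = L_norm, T_k = 2 L_norm T_{k-1} - T_{k-2}."""
    T = [np.eye(L_norm.shape[0]), L_norm]
    for _ in range(2, K + 1):
        T.append(2 * L_norm @ T[-1] - T[-2])
    return T[:K + 1]

def light_merge(T_list, X, W, hop_weights, bias):
    """Light merging: one shared transform W, one learned scalar per hop, and one
    bias, instead of a separate weight matrix per hop."""
    return sum(w * (T_k @ X @ W) for w, T_k in zip(hop_weights, T_list)) + bias
```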

6. Normalization, Over-Smoothing, and Ablation Analyses

Symmetric normalization, as in $\frac{1}{\sqrt{|\mathcal N_u|}\,\sqrt{|\mathcal N_i|}}$, is essential for stable aggregation and performance. Comparative analyses show that asymmetric (single-side) or $L_1$ normalizations are less effective (by up to 10% relative loss in Recall/NDCG).
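A minimal sketch of computing the symmetrically normalized adjacency $D^{-1/2} A D^{-1/2}$ with SciPy sparse matrices (the function name is illustrative):

```python
import numpy as np
import scipy.sparse as sp

def sym_norm_adj(A):
    """Symmetrically normalized adjacency D^{-1/2} A D^{-1/2} for a sparse matrix A."""
    deg = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(deg), 0.0)
    D_inv_sqrt = sp.diags(d_inv_sqrt)
    return D_inv_sqrt @ A @ D_inv_sqrt
```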

Ablation studies confirm:

  • Removing feature transforms (the learnable matrices $W$) and nonlinearities (e.g., ReLU) each benefits performance, and their combination maximizes gains (+12.6% Recall vs. standard NGCF).
  • Combining embeddings from all propagation layers (rather than using the $K$-th layer only) addresses node over-smoothing, paralleling personalized PageRank and self-connected SGCN designs.
  • PairNorm and similar normalization methods, when used in MC settings, prevent embedding collapse in deep light GCN stacks.

7. Practical Considerations and Empirical Guidelines

  • Optimal propagation depth $K$ is typically $3$ or $4$; larger $K$ can induce over-smoothing in the absence of node features.
  • Uniform aggregation weights are sufficient in most cases; attention-based weighting yields marginal improvement (<1%).
  • For light multi-order convolution, the information gain from additional hops decays exponentially: the practical value of $K$ can be chosen based on a targeted percentage of retained information, with a small number of hops typically sufficient for near-complete coverage (Gao et al., 2020).
  • LGC-based methods offer highly competitive runtime and memory profiles, scaling linearly in the number of edges and not requiring dropout or extensive regularization.

Summary Table: LGC vs. Dense GCN (Recommendation)

Model    | Transforms & Activation | Aggregation                                        | Parameterization | Mean Rel. Impr. Recall@20 / NDCG@20
NGCF     | W + nonlinearity        | Sum                                                | High             | baseline
LightGCN | None                    | Sum, symmetric $1/\sqrt{|\mathcal N_u|\,|\mathcal N_i|}$ norm | Minimal          | +16.5% / +16.4%

LGC methodology achieves competitive or superior accuracy with far less architectural complexity, challenging the paradigm that deeper nonlinear transformations are universally beneficial in graph learning tasks. Its extension to multi-criteria and multi-order domains further establishes LGC as a core principle in GNN design for large-scale, sparse, or ID-centric graphs (He et al., 2020, Gao et al., 2020, Park et al., 2023).
