Posterior Marginalization in Signed Graphs
- The paper introduces a Bayesian framework that infers node labels by marginalizing over latent signed graphs to address structural uncertainty and heterophily.
- It implements a Sparse Signed Message Passing Network (SSMPN) using variational inference and sparse coding to selectively aggregate positive and negative neighbor information.
- Experimental results on heterophilic datasets demonstrate robust performance improvements in accuracy and memory efficiency compared to traditional graph models.
Posterior marginalization over signed graph structures involves inferring predictive distributions for node classification by averaging over possible realizations of graph connectivity, where each edge can be positive (homophilic), negative (heterophilic), or absent. This methodology directly addresses structural noise and heterophily by treating the graph as a latent signed structure with uncertainty, rather than relying on a fixed observed adjacency. The Sparse Signed Message Passing Network (SSMPN) operationalizes this approach within a Bayesian framework, integrating variational inference and sparse signal aggregation to achieve robust learning on unreliable, label-disassortative graphs (Choi et al., 3 Jan 2026).
1. Probabilistic Framework for Posterior Marginalization
The foundational model assumes an observed undirected graph with binary adjacency matrix $\mathbf{A} \in \{0,1\}^{N \times N}$, interpreted as a noisy realization of an underlying signed adjacency $\mathbf{S} \in \{-1, 0, +1\}^{N \times N}$. Each entry $S_{ij}$ encodes the edge type: $+1$ for supportive (homophilic), $-1$ for antagonistic (heterophilic), and $0$ for absence.
A prior is placed independently over the edge types of observed pairs,
$$p(\mathbf{S}) \;=\; \prod_{(i,j) \in E} \mathrm{Cat}\!\left(S_{ij} \mid \pi_{+}, \pi_{-}, \pi_{0}\right),$$
where $E$ is the observed edge set and $(\pi_{+}, \pi_{-}, \pi_{0})$ are prior edge-type probabilities. Entries for pairs not in $E$ are fixed to $0$.
Given node features $\mathbf{X}$ and a labeled training set $\mathcal{Y}_L$, the Bayes-optimal prediction for a node $i$ marginalizes over signed structures,
$$p(y_i \mid \mathbf{A}, \mathbf{X}, \mathcal{Y}_L) \;=\; \sum_{\mathbf{S}} p(y_i \mid \mathbf{S}, \mathbf{X}, \mathcal{Y}_L)\; p(\mathbf{S} \mid \mathbf{A}, \mathbf{X}, \mathcal{Y}_L).$$
Exact marginalization over $\mathbf{S}$ is intractable. Thus, an amortized variational posterior $q_\phi(\mathbf{S} \mid \mathbf{A}, \mathbf{X})$ with per-edge categorical marginals approximates the true structural posterior. The marginals are parameterized by a two-layer GCN encoder, an MLP decoder, and a softmax layer.
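A minimal sketch of such an amortized edge-type posterior is given below, assuming dense normalized-adjacency GCN layers, concatenated endpoint embeddings as the edge representation, and illustrative names such as `EdgePosterior`; these choices are not prescribed by the paper and serve only to make the parameterization concrete.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgePosterior(nn.Module):
    """Amortized q_phi: per-edge categorical marginals over {+1, -1, absent}."""

    def __init__(self, in_dim, hid_dim=64):
        super().__init__()
        # Two-layer GCN encoder producing node embeddings.
        self.gcn1 = nn.Linear(in_dim, hid_dim)
        self.gcn2 = nn.Linear(hid_dim, hid_dim)
        # MLP decoder mapping an edge representation to three edge-type logits.
        self.decoder = nn.Sequential(
            nn.Linear(2 * hid_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, 3)
        )

    def forward(self, x, adj_norm, edge_index):
        # adj_norm: symmetrically normalized adjacency (N x N, dense here for brevity).
        h = F.relu(adj_norm @ self.gcn1(x))
        h = adj_norm @ self.gcn2(h)
        # Edge representation: concatenation of endpoint embeddings.
        src, dst = edge_index                 # each of shape (E,)
        e = torch.cat([h[src], h[dst]], dim=-1)
        logits = self.decoder(e)              # (E, 3) logits for {+1, -1, absent}
        return F.softmax(logits, dim=-1)      # categorical marginals q_phi(S_ij)
```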
2. Sparse Signed Message Passing Layer
Message aggregation is performed by conditioning on a sampled signed adjacency $\tilde{\mathbf{S}} \sim q_\phi$. Let $\mathbf{h}_i$ denote the embedding of node $i$; for each node, a neighbor "dictionary" $\mathbf{D}_i$ (the stacked embeddings of its neighbors) and a target vector (the node's own embedding $\mathbf{h}_i$) are constructed. The sampled signs split the neighborhood into positive and negative classes, $\mathcal{N}_i^+ = \{\, j : \tilde{S}_{ij} = +1 \,\}$ and $\mathcal{N}_i^- = \{\, j : \tilde{S}_{ij} = -1 \,\}$.

Sparse coding is performed by solving, for each node $i$, the LASSO problem
$$\boldsymbol{\alpha}_i^{\star} \;=\; \arg\min_{\boldsymbol{\alpha}_i}\; \tfrac{1}{2}\,\big\|\mathbf{h}_i - \mathbf{D}_i \boldsymbol{\alpha}_i\big\|_2^2 \;+\; \lambda\,\|\boldsymbol{\alpha}_i\|_1,$$
where $\mathbf{D}_i$ comprises the embeddings $\mathbf{h}_j$ for neighbors $j \in \mathcal{N}_i^+ \cup \mathcal{N}_i^-$. Aggregation distinguishes attractive and repulsive effects,
$$\mathbf{m}_i \;=\; \sum_{j \in \mathcal{N}_i^+} \alpha^{\star}_{ij}\,\mathbf{h}_j \;-\; \beta \sum_{j \in \mathcal{N}_i^-} \alpha^{\star}_{ij}\,\mathbf{h}_j,$$
where $\beta$ modulates repulsion from negative edges. This signed aggregation, combined with sparse coding, adaptively suppresses noisy or irrelevant neighbors.
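The following sketch illustrates one way such a layer could be realized, solving each node's LASSO with a few ISTA iterations; the solver, the per-node loop, the hyperparameter values, and the function names are assumptions for illustration rather than details taken from the paper.

```python
import torch

def soft_threshold(x, t):
    # Proximal operator of the l1 norm (elementwise soft-thresholding).
    return torch.sign(x) * torch.clamp(x.abs() - t, min=0.0)

def sparse_signed_aggregate(h, pos_nbrs, neg_nbrs, lam=0.1, beta=0.5, n_iter=30):
    """One signed message-passing step with LASSO-selected neighbor weights.

    h        : (N, d) node embeddings
    pos_nbrs : list of LongTensors, positive neighbors of each node
    neg_nbrs : list of LongTensors, negative neighbors of each node
    """
    out = torch.zeros_like(h)
    for i in range(h.size(0)):
        nbrs = torch.cat([pos_nbrs[i], neg_nbrs[i]])
        if nbrs.numel() == 0:
            continue
        D = h[nbrs]              # neighbor embeddings as rows; the text's dictionary is D.T
        target = h[i]
        # ISTA for: min_a 0.5 * ||target - D.T @ a||^2 + lam * ||a||_1
        alpha = torch.zeros(nbrs.numel())
        step = 1.0 / (torch.linalg.norm(D, 2) ** 2 + 1e-8)   # 1 / Lipschitz constant
        for _ in range(n_iter):
            resid = D.t() @ alpha - target                    # reconstruction error
            alpha = soft_threshold(alpha - step * (D @ resid), step * lam)
        # Signed aggregation: attract along positive edges, repel along negative ones.
        k_pos = pos_nbrs[i].numel()
        out[i] = (alpha[:k_pos, None] * D[:k_pos]).sum(0) \
                 - beta * (alpha[k_pos:, None] * D[k_pos:]).sum(0)
    return out
```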
3. Variational Training Objective
Learning is performed by maximizing an evidence lower bound (ELBO). The negative structural ELBO loss combines a reconstruction term for the observed adjacency with a KL term tying the variational posterior to the edge-type prior,
$$\mathcal{L}_{\mathrm{struct}} \;=\; -\,\mathbb{E}_{q_\phi(\mathbf{S})}\!\left[\log p_\theta(\mathbf{A} \mid \mathbf{S})\right] \;+\; \mathrm{KL}\!\left(q_\phi(\mathbf{S}) \,\|\, p(\mathbf{S})\right).$$
A sparsity-promoting regularizer $\mathcal{L}_{\mathrm{sparse}}$, computed from Gumbel-softmax samples $\tilde{\mathbf{s}}_{ij}$ drawn from $q_\phi$, enforces neighbor selection. The total loss aggregates classification, sparsity, and structural regularization terms,
$$\mathcal{L} \;=\; \mathcal{L}_{\mathrm{cls}} \;+\; \lambda_{1}\,\mathcal{L}_{\mathrm{sparse}} \;+\; \lambda_{2}\,\mathcal{L}_{\mathrm{struct}}.$$
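A compact sketch of how these terms could be assembled is shown below; the binary cross-entropy reconstruction term, the choice of the non-absent probability mass as the sparsity penalty, and the weights `lam_sparse` and `lam_struct` are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def total_loss(logits, labels, train_mask, q_edge, prior, edge_targets,
               lam_sparse=0.01, lam_struct=0.1):
    """logits: (N, C) class logits; q_edge: (E, 3) marginals over {+1, -1, absent};
    prior: (3,) prior edge-type probabilities; edge_targets: (E,) 1.0 for observed pairs."""
    # Supervised classification loss on labeled nodes.
    cls = F.cross_entropy(logits[train_mask], labels[train_mask])

    # Structural term: reconstruct observed edges from the non-absent probability,
    # plus KL between the per-edge categorical posterior and the edge-type prior.
    p_present = q_edge[:, 0] + q_edge[:, 1]
    recon = F.binary_cross_entropy(p_present, edge_targets)
    kl = (q_edge * (q_edge.clamp_min(1e-8).log()
                    - prior.clamp_min(1e-8).log())).sum(-1).mean()
    struct = recon + kl

    # Sparsity term: discourage keeping many (signed) neighbors per node.
    sparse = p_present.mean()

    return cls + lam_sparse * sparse + lam_struct * struct
```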
4. Training and Inference Algorithm
The full training loop consists of the following steps per mini-batch (a schematic sketch in code follows the list):
- Encode the observed graph and node features via the GCN encoder and MLP decoder, yielding per-edge categorical marginals $q_\phi(S_{ij})$.
- For each Monte Carlo sample $k = 1, \dots, K$:
- Sample a signed adjacency $\tilde{\mathbf{S}}^{(k)}$ via Gumbel-softmax reparameterization.
- Forward-propagate through the sparse signed message passing layers, solving the neighborwise LASSO for $\boldsymbol{\alpha}_i^{\star}$, and compute final embeddings and class logits.
- Predictive node label distributions are approximated by Monte Carlo averaging, $p(y_i \mid \mathbf{A}, \mathbf{X}, \mathcal{Y}_L) \approx \frac{1}{K}\sum_{k=1}^{K} p_\theta\big(y_i \mid \tilde{\mathbf{S}}^{(k)}, \mathbf{X}\big)$.
- Compute the supervised loss $\mathcal{L}_{\mathrm{cls}}$, add the sparsity and structural penalties, and update the model parameters.
- During inference, sample $K$ signed adjacencies, aggregate the per-sample predictions, and output the averaged result.
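A schematic of this loop, reusing the hypothetical `EdgePosterior` and `sparse_signed_aggregate` sketches from above and omitting the sparsity and structural penalties for brevity (hard Gumbel-softmax samples and a single linear classifier head are likewise illustrative assumptions):

```python
import torch.nn.functional as F

# EdgePosterior and sparse_signed_aggregate refer to the illustrative sketches above.

def train_step(edge_posterior, classifier, optimizer, x, adj_norm, edge_index,
               labels, train_mask, K=4, tau=0.5, beta=0.5):
    optimizer.zero_grad()
    q_edge = edge_posterior(x, adj_norm, edge_index)   # (E, 3) marginals over {+1, -1, absent}
    src, dst = edge_index
    logits_sum = 0.0
    for _ in range(K):
        # Gumbel-softmax reparameterized (hard) sample of edge types.
        s = F.gumbel_softmax(q_edge.clamp_min(1e-8).log(), tau=tau, hard=True)
        pos_mask, neg_mask = s[:, 0].bool(), s[:, 1].bool()
        # Hard samples are used here only to index neighbors; the relaxed weights
        # that carry gradients back to the edge posterior are omitted for brevity.
        pos_nbrs = [dst[pos_mask & (src == i)] for i in range(x.size(0))]
        neg_nbrs = [dst[neg_mask & (src == i)] for i in range(x.size(0))]
        h = sparse_signed_aggregate(x, pos_nbrs, neg_nbrs, beta=beta)
        logits_sum = logits_sum + classifier(h)
    # Monte Carlo average over sampled signed structures (averaging logits
    # as a simplification of averaging predictive distributions).
    logits = logits_sum / K
    loss = F.cross_entropy(logits[train_mask], labels[train_mask])
    # The full objective adds the sparsity and structural (ELBO) penalties here.
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference, the same Monte Carlo averaging is applied with gradients disabled, and the averaged class distribution is reported.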
5. Handling Structural Uncertainty and Heterophily
By maintaining an explicit posterior over signed structures, posterior marginalization approximates a Bayes-optimal ensemble: the excess risk of the marginalized predictor admits an explicit bound (Theorem 1, (Choi et al., 3 Jan 2026)). Signed aggregation differentially contracts or expands class representations, with positive edges enforcing similarity and negative edges enforcing separation, which provably increases inter-class distances under standard stochastic block models (Theorem 3). Sparse coding instantiates a locally MAP estimator for neighbor contributions under a Gaussian–Laplace prior, thereby limiting the influence of noisy or structurally ambiguous neighbors.
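As a brief illustration of the last point, combining a Gaussian observation model for the target embedding with an independent Laplace prior on the neighbor coefficients recovers the neighborwise LASSO as the MAP estimate (a standard derivation, restated in the notation used above):
$$p(\mathbf{h}_i \mid \boldsymbol{\alpha}_i) \propto \exp\!\Big(-\tfrac{1}{2\sigma^2}\,\big\|\mathbf{h}_i - \mathbf{D}_i\boldsymbol{\alpha}_i\big\|_2^2\Big), \qquad p(\boldsymbol{\alpha}_i) \propto \exp\!\big(-b\,\|\boldsymbol{\alpha}_i\|_1\big),$$
$$\boldsymbol{\alpha}_i^{\star} = \arg\max_{\boldsymbol{\alpha}_i}\;\big[\log p(\mathbf{h}_i \mid \boldsymbol{\alpha}_i) + \log p(\boldsymbol{\alpha}_i)\big] = \arg\min_{\boldsymbol{\alpha}_i}\;\tfrac{1}{2}\big\|\mathbf{h}_i - \mathbf{D}_i\boldsymbol{\alpha}_i\big\|_2^2 + \lambda\,\|\boldsymbol{\alpha}_i\|_1, \qquad \lambda = \sigma^2 b,$$
so the sparse-coding step can be read as locally MAP inference over neighbor contributions.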
This framework inherently supports robustness to both edge noise and heterophily, as the model can leverage supporting and opposing relations and adaptively select the most informative subset of neighbors.
6. Experimental Validation under Structural Noise
Experiments encompass nine heterophilic benchmarks (RomanEmpire, Minesweeper, AmazonRatings, Chameleon, Squirrel, Actor, Cornell, Texas, Wisconsin) with homophily ratios as low as 0.03. Across these datasets, and relative to heterophily-aware and spectral baseline models (HGCN, GPRGNN, FAGCN, DirGNN, L2DGCN), SSMPN consistently ranks first or among the top three in accuracy (e.g., 75.0% on RomanEmpire vs. 70.3% by CGNN; 83.8% on Texas vs. 76.7% by L2DGCN).
Robustness studies on the Texas dataset show that random edge deletions up to 60% result in under 10% performance drop for SSMPN, compared to over 20% degradation for GCN/GAT. Tests with Gaussian feature noise and adversarial edge perturbations indicate that sparse signed aggregation constrains oversmoothing and error amplification.
Evaluation on large-scale heterophilic graphs (Penn94, arXiv-year, snap-patents) demonstrates memory efficiency and performance improvements of 4–8 points over baselines. Competing architectures such as GCNII and HGCN either exhibit performance collapse or run out of memory on the largest graph.
7. Significance and Implications
Posterior marginalization over signed graph structures, as instantiated by SSMPN, provides a principled Bayesian methodology for graph learning under uncertainty and heterophily. Its explicit modeling of structural ambiguity and relation polarity outperforms fixed-structure and naive regularization approaches on noisy and disassortative graphs. The integration of variational Bayesian inference and sparse signed message passing enables scalability, selective neighbor utilization, and resistance to oversmoothing, establishing a new standard for robust semi-supervised node classification under structural uncertainty (Choi et al., 3 Jan 2026).