Posterior Marginalization in Signed Graphs

Updated 10 January 2026
  • The paper introduces a Bayesian framework that infers node labels by marginalizing over latent signed graphs to address structural uncertainty and heterophily.
  • It implements a Sparse Signed Message Passing Network (SSMPN) using variational inference and sparse coding to selectively aggregate positive and negative neighbor information.
  • Experiments on heterophilic datasets demonstrate robust improvements in accuracy and memory efficiency over conventional graph models.

Posterior marginalization over signed graph structures involves inferring predictive distributions for node classification by averaging over possible realizations of graph connectivity, where each edge can be positive (homophilic), negative (heterophilic), or absent. This methodology directly addresses structural noise and heterophily by treating the graph as a latent signed structure with uncertainty, rather than relying on a fixed observed adjacency. The Sparse Signed Message Passing Network (SSMPN) operationalizes this approach within a Bayesian framework, integrating variational inference and sparse signal aggregation to achieve robust learning on unreliable, label-disassortative graphs (Choi et al., 3 Jan 2026).

1. Probabilistic Framework for Posterior Marginalization

The foundational model assumes an observed undirected graph $\mathcal G_{\mathrm{obs}} = (\mathcal V, \mathcal E_{\mathrm{obs}})$ with binary adjacency $A_{\mathrm{obs}}\in\{0,1\}^{n\times n}$, interpreted as a noisy realization of an underlying signed adjacency $Z\in\{-1,0,+1\}^{n\times n}$. Each entry $z_{ij}$ encodes the edge type: $+1$ for supportive (homophilic), $-1$ for antagonistic (heterophilic), and $0$ for absence.

A prior is placed independently over edge types for observed pairs:

$$p(Z) = \prod_{(i,j)\in\mathcal E_{\mathrm{obs}}} p(z_{ij}), \qquad p(z_{ij}=s) = \pi_s^0, \quad s\in\{-1,0,+1\}$$

Edges not in $\mathcal E_{\mathrm{obs}}$ are fixed to $z_{ij}=0$.
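
As a concrete illustration, the prior can be realized as independent categorical draws over the observed edge set. The snippet below is a minimal PyTorch sketch; the prior probabilities $\pi_s^0$ and edge count are illustrative, not values from the paper.

```python
import torch
from torch.distributions import Categorical

# Independent categorical prior over edge types {-1, 0, +1} for observed edges.
# The probabilities pi^0 below are illustrative, not values from the paper.
pi0 = torch.tensor([0.25, 0.50, 0.25])    # p(z_ij = -1), p(z_ij = 0), p(z_ij = +1)
prior = Categorical(probs=pi0)

num_obs_edges = 1000                       # |E_obs| (hypothetical)
z = prior.sample((num_obs_edges,)) - 1     # map class indices {0, 1, 2} to {-1, 0, +1}
# Pairs outside E_obs are never sampled: their entries stay fixed at z_ij = 0.
```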

Given node features $X$ and a labeled training set $Y_{\mathcal L}$, the Bayes-optimal prediction is

$$p^\star(y_i \mid X, A_{\mathrm{obs}}, Y_{\mathcal L}) = \mathbb{E}_{Z\sim p(\cdot\,\mid\,A_{\mathrm{obs}},X,Y_{\mathcal L})}\left[p(y_i \mid X, Z)\right]$$

Marginalizing over $p(Z\mid\cdot)$ is intractable. Instead, an amortized variational posterior $q_\phi(Z\mid A_{\mathrm{obs}}, X, Y_{\mathcal L})=\prod_{(i,j)\in\mathcal E_{\mathrm{obs}}}q_\phi(z_{ij}\mid\cdot)$ with categorical marginals $\pi_{ij}^s$ approximates $p(Z\mid\cdot)$. The marginals are parameterized by a two-layer GCN encoder, an MLP decoder, and a softmax layer.
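
A minimal sketch of such an amortized encoder is given below, assuming PyTorch, a pre-normalized adjacency matrix, and illustrative hidden sizes; the paper's exact architecture and its conditioning on $Y_{\mathcal L}$ may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgePosteriorEncoder(nn.Module):
    """Amortized posterior q_phi(z_ij | A_obs, X): two GCN propagation steps,
    an MLP decoder over edge endpoints, and a softmax over the edge types
    {-1, 0, +1}. Hidden sizes are illustrative; conditioning on Y_L is omitted
    for brevity."""

    def __init__(self, in_dim, hid_dim=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, hid_dim)
        self.decoder = nn.Sequential(
            nn.Linear(2 * hid_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, 3)
        )

    def forward(self, x, a_norm, edge_index):
        # Two-layer GCN encoder with a pre-normalized adjacency a_norm (n x n).
        h = F.relu(a_norm @ self.lin1(x))
        h = a_norm @ self.lin2(h)
        # Decode a categorical distribution per observed edge (i, j).
        src, dst = edge_index                                 # (2, |E_obs|)
        logits = self.decoder(torch.cat([h[src], h[dst]], dim=-1))
        return F.softmax(logits, dim=-1)                      # pi_ij^s, s in {-1, 0, +1}
```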

2. Sparse Signed Message Passing Layer

Message aggregation is performed by conditioning on a sampled signed adjacency $Z\sim q_\phi$. Let $H\in\mathbb R^{n\times d_{\mathrm{in}}}$ denote the node embeddings; a neighbor "dictionary" $V = HW_v$ and a per-node target vector $t_i = W_t h_i$ are constructed. The neighborhoods

$$\mathcal N^+_i(Z) = \{j: z_{ij}=+1\}, \qquad \mathcal N^-_i(Z) = \{j: z_{ij}=-1\}$$

divide neighbors into positive and negative classes.

Sparse coding is performed by solving, for each node $i$,

$$\alpha_i^\star = \arg\min_\alpha \| t_i - V_i \alpha \|_2^2 + \lambda\|\alpha\|_1$$

where $V_i$ comprises the vectors $v_j$ for neighbors $j \in \mathcal N_i(Z)$. Aggregation distinguishes attractive and repulsive effects:

$$h_i' = W_o \left( \sum_{j\in \mathcal N^+_i} \alpha_{ij}^+ v_j - \gamma \sum_{j\in \mathcal N^-_i}\lvert\alpha_{ij}^-\rvert v_j \right) + b$$

The coefficient $\gamma$ modulates repulsion from negative edges. This signed aggregation, combined with sparse coding, adaptively suppresses noisy or irrelevant neighbors.
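
The per-node computation can be sketched as follows, assuming PyTorch, an ISTA solver for the LASSO, and a dense signed adjacency; $\gamma$, $\lambda$, and the iteration count are illustrative choices rather than the paper's settings.

```python
import torch

def soft_threshold(x, lam):
    """Elementwise soft-thresholding operator used by ISTA."""
    return torch.sign(x) * torch.clamp(x.abs() - lam, min=0.0)

def sparse_signed_aggregate(h, z, W_v, W_t, W_o, b, gamma=0.5, lam=0.1, n_ista=25):
    """One sparse signed message-passing step (a sketch).

    h: (n, d_in) embeddings; z: (n, n) signed adjacency with entries in {-1, 0, +1};
    W_v, W_t: (d_in, d); W_o: (d, d_out); b: (d_out,).
    gamma, lam, n_ista and the dense-adjacency layout are assumptions.
    """
    V = h @ W_v                                   # neighbor "dictionary" rows v_j
    T = h @ W_t                                   # target vectors t_i
    out = torch.zeros(h.shape[0], W_o.shape[1])
    for i in range(h.shape[0]):
        nbrs = (z[i] != 0).nonzero(as_tuple=True)[0]
        if nbrs.numel() == 0:
            out[i] = b
            continue
        Vi = V[nbrs]                              # (deg, d): vectors v_j stacked as rows
        # ISTA for alpha* = argmin ||t_i - Vi^T alpha||_2^2 + lam * ||alpha||_1
        L = 2.0 * torch.linalg.matrix_norm(Vi, ord=2) ** 2 + 1e-8   # gradient Lipschitz const.
        alpha = torch.zeros(nbrs.numel())
        for _ in range(n_ista):
            grad = 2.0 * Vi @ (Vi.t() @ alpha - T[i])
            alpha = soft_threshold(alpha - grad / L, lam / L)
        # Signed aggregation: attract along positive edges, repel along negative ones.
        sign = z[i, nbrs].float()
        coeff = torch.where(sign > 0, alpha, -gamma * alpha.abs())
        out[i] = (coeff.unsqueeze(1) * Vi).sum(0) @ W_o + b
    return out
```

The explicit per-node loop is only for clarity; a practical implementation would batch the LASSO solves across nodes.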

3. Variational Training Objective

Learning is performed by maximizing the evidence lower bound (ELBO). The negative structural ELBO loss is

$$\mathcal L_{\mathrm{struct}}(\phi) = \mathrm{KL}\bigl(q_\phi(Z\mid A_{\mathrm{obs}},X,Y_{\mathcal L})\,\|\,p(Z)\bigr) - \mathbb{E}_{Z\sim q_\phi}\bigl[\log p(A_{\mathrm{obs}}\mid Z)\bigr]$$

A sparsity-promoting regularizer enforces neighbor selection:

$$\mathcal L_{\mathrm{sparse}}(\theta) = \frac{1}{n}\sum_{i=1}^n \mathbb{E}_{Z\sim q_\phi}\bigl[\|\alpha_i(Z)\|_1\bigr] \approx \frac{1}{nK}\sum_{k=1}^K \sum_{i=1}^n \|\alpha_i(Z^{(k)})\|_1$$

where $Z^{(k)}$ are Gumbel-softmax samples from $q_\phi$. The total loss aggregates classification, sparsity, and structural regularization terms:

$$\mathcal L_{\mathrm{total}} = \mathcal L_{\mathrm{cls}} + \lambda_{\mathrm{sp}}\,\mathcal L_{\mathrm{sparse}} + \lambda_{\mathrm{st}}\,\mathcal L_{\mathrm{struct}}$$
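
A sketch of how these three terms might be combined, assuming PyTorch; the loss weights, tensor layouts, and helper names are illustrative rather than the paper's implementation.

```python
import torch
import torch.nn.functional as F

def total_loss(logits, labels, train_mask, alphas, edge_probs, prior_probs,
               obs_recon_logp, lambda_sp=1e-3, lambda_st=1e-2):
    """Combine L_cls + lambda_sp * L_sparse + lambda_st * L_struct (a sketch).

    logits:         (K, n, C) logits, one slice per Gumbel-softmax sample Z^(k)
    alphas:         length-K list; each element is a list of n sparse codes alpha_i(Z^(k))
    edge_probs:     (|E_obs|, 3) posterior marginals pi_ij^s from q_phi
    prior_probs:    (3,) prior probabilities pi^0_s
    obs_recon_logp: (K,) Monte Carlo estimates of log p(A_obs | Z^(k))
    """
    K, n, _ = logits.shape

    # L_cls: negative log-likelihood of the averaged predictive distribution.
    probs = F.softmax(logits, dim=-1).mean(0)                        # (n, C)
    cls = F.nll_loss(torch.log(probs[train_mask] + 1e-12), labels[train_mask])

    # L_sparse: (1 / nK) * sum_k sum_i ||alpha_i(Z^(k))||_1
    sparse = sum(a.abs().sum() for sample in alphas for a in sample) / (n * K)

    # L_struct: KL(q_phi || p) over observed edges minus the reconstruction term.
    kl = (edge_probs * (torch.log(edge_probs + 1e-12)
                        - torch.log(prior_probs + 1e-12))).sum()
    struct = kl - obs_recon_logp.mean()

    return cls + lambda_sp * sparse + lambda_st * struct
```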

4. Training and Inference Algorithm

The full training loop consists of the following steps per mini-batch (a code sketch follows the list):

  1. Encode $(A_{\mathrm{obs}}, X, Y_{\mathcal L})$ via the GCN$_\phi$ encoder and MLP, yielding edge marginals $\pi_{ij}^s$.
  2. For $k=1, \ldots, K$:
    • Sample $Z^{(k)} \sim q_\phi$ via Gumbel-softmax reparameterization.
    • Forward-propagate through $L$ sparse signed message passing (S$^2$) layers, solving the neighborwise LASSO for $\alpha_i^{(k)}$, and compute final embeddings and logits $\ell_i^{(k)}$.
  3. Approximate the predictive node label distributions as

$$\hat p_\theta(y_i \mid X, A_{\mathrm{obs}}) = \frac{1}{K}\sum_{k=1}^K \mathrm{softmax}(\ell_i^{(k)})$$

  4. Compute the supervised loss $\mathcal L_{\mathrm{cls}}$, add the sparsity and structural penalties, and update $(\theta, \phi)$.
  5. During inference, sample $K$ signed adjacencies $Z^{(k)}$, aggregate the predictions, and output the averaged results.
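
A compact sketch of this loop, assuming PyTorch and hypothetical `model`/`encoder` modules (the S$^2$ network and edge-posterior encoder from the sections above); $K$ and the Gumbel temperature are illustrative values.

```python
import torch
import torch.nn.functional as F

def train_step(model, encoder, x, a_norm, edge_index, labels, train_mask,
               optimizer, K=5, tau=0.5):
    """One mini-batch step of the loop above (a sketch; `model`, `encoder`,
    K and tau are hypothetical names/values, not the paper's API)."""
    optimizer.zero_grad()

    # Step 1: amortized edge-type marginals pi_ij over {-1, 0, +1}.
    pi = encoder(x, a_norm, edge_index)                        # (|E_obs|, 3)

    per_sample_logits = []
    for _ in range(K):
        # Step 2a: Gumbel-softmax sample of Z^(k) (straight-through, hard one-hot).
        z_onehot = F.gumbel_softmax(torch.log(pi + 1e-12), tau=tau, hard=True)
        z_edge = z_onehot @ torch.tensor([-1.0, 0.0, 1.0])     # per-edge sign
        # Step 2b: forward pass through the sparse signed message-passing layers.
        per_sample_logits.append(model(x, edge_index, z_edge))

    # Step 3: average the per-sample predictive distributions.
    probs = torch.stack([F.softmax(l, dim=-1) for l in per_sample_logits]).mean(0)

    # Step 4: supervised loss; the sparsity and structural penalties of
    # Section 3 would be added here before backpropagation.
    loss = F.nll_loss(torch.log(probs[train_mask] + 1e-12), labels[train_mask])
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference time the same Monte Carlo averaging over $K$ sampled signed adjacencies is reused, with gradient tracking disabled.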

5. Handling Structural Uncertainty and Heterophily

By maintaining an explicit posterior $q_\phi(Z)$ over signed structures, posterior marginalization achieves a Bayes-optimal ensemble: the excess risk is bounded by $\|q_\phi - p\|_1$ (Theorem 1, Choi et al., 3 Jan 2026). Signed aggregation differentially contracts or expands class representations, with positive edges enforcing similarity and negative edges enforcing separation, which provably increases inter-class distances under standard stochastic block models (Theorem 3). Sparse coding instantiates a locally MAP estimator for neighbor contributions under a Gaussian–Laplace prior, thereby limiting the influence of noisy or structurally ambiguous neighbors.
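
To make the Gaussian–Laplace connection explicit, the standard MAP argument can be sketched as follows, with a Gaussian noise scale $\sigma$ and Laplace scale $b$ used as generic parameters rather than values from the paper:

$$\begin{aligned} \alpha_i^\star &= \arg\max_\alpha\; \mathcal N\!\left(t_i;\, V_i\alpha,\, \sigma^2 I\right)\prod_k \tfrac{1}{2b}\, e^{-\lvert\alpha_k\rvert/b} = \arg\min_\alpha\; \tfrac{1}{2\sigma^2}\lVert t_i - V_i\alpha\rVert_2^2 + \tfrac{1}{b}\lVert\alpha\rVert_1 \\ &= \arg\min_\alpha\; \lVert t_i - V_i\alpha\rVert_2^2 + \lambda\lVert\alpha\rVert_1, \qquad \lambda = \tfrac{2\sigma^2}{b}, \end{aligned}$$

which is exactly the per-node LASSO of Section 2.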

This framework inherently supports robustness to both edge noise and heterophily, as the model can leverage supporting and opposing relations and adaptively select the most informative subset of neighbors.

6. Experimental Validation under Structural Noise

Experiments encompass nine heterophilic benchmarks (RomanEmpire, Minesweeper, AmazonRatings, Chameleon, Squirrel, Actor, Cornell, Texas, Wisconsin) with homophily ratios as low as 0.03. Across these datasets and relative to heterophily-aware and spectral baseline models (H$_2$GCN, GPRGNN, FAGCN, DirGNN, L2DGCN), SSMPN consistently ranks best or in the top three for accuracy (e.g., 75.0% on RomanEmpire vs. 70.3% by CGNN; 83.8% on Texas vs. 76.7% by L2DGCN).

Robustness studies on the Texas dataset show that random edge deletions of up to 60% cause a performance drop of under 10% for SSMPN, compared to over 20% degradation for GCN/GAT. Tests with Gaussian feature noise and adversarial edge perturbations indicate that sparse signed aggregation limits oversmoothing and error amplification.

Evaluation on large-scale heterophilic graphs (Penn94, arXiv-year, snap-patents) demonstrates memory efficiency and performance improvements of 4–8 points over baselines. Competing architectures such as GCNII and H$_2$GCN either exhibit performance collapse or run out of memory on the largest graph.

7. Significance and Implications

Posterior marginalization over signed graph structures, as instantiated by SSMPN, provides a principled Bayesian methodology for graph learning under uncertainty and heterophily. Its explicit modeling of structural ambiguity and relation polarity outperforms fixed-structure and naive regularization approaches on noisy and disassortative graphs. The integration of variational Bayesian inference and sparse signed message passing enables scalability, selective neighbor utilization, and resistance to oversmoothing, establishing a new standard for robust semi-supervised node classification under structural uncertainty (Choi et al., 3 Jan 2026).
