
Fine-Grained Path Augmentation (FGPA)

Updated 18 March 2026
  • FGPA is a framework for graph gradual domain adaptation that constructs intermediate graphs by interpolating source and target domains using the fused Gromov–Wasserstein distance.
  • It employs a T-stage self-training regime with confidence weighting to mitigate error accumulation along the optimal geodesic path.
  • FGPA integrates with existing graph adaptation methods and demonstrates empirical gains on real-world datasets such as Airport, Citation, and Social graphs.

Fine-Grained Path Augmentation (FGPA) is a framework for graph gradual domain adaptation (GDA), specifically designed to address the challenge of large, non-independent-and-identically-distributed (non-IID) shifts between graph domains. FGPA constructs an optimal sequence of intermediate graphs connecting source and target domains via a geodesic under the Fused Gromov–Wasserstein (FGW) distance and augments self-training with confidence weighting along this interpolated trajectory. The methodology is model-agnostic and compatible with standard graph domain adaptation (graph DA) penalties, enabling robust adaptation in settings where prior approaches that assume mild shifts or pre-specified paths are inadequate (2505.12709).

1. Fused Gromov–Wasserstein Distance in Graph Domain Adaptation

FGPA leverages the FGW distance for measuring graph discrepancy. Given attributed graphs $G_0 = (V_0, A_0, X_0)$ and $G_1 = (V_1, A_1, X_1)$, with node-weight histograms $\mu_0 \in \Delta_{|V_0|}$, $\mu_1 \in \Delta_{|V_1|}$, cross-graph feature cost $M(i,j) = \|X_0(i) - X_1(j)\|_{\mathcal{X}}^q$, and structure matrices $C_0 = A_0$, $C_1 = A_1$, the FGW distance of order $q$ with trade-off $\alpha$ is

$$d_{\mathrm{FGW};q,\alpha}(G_0, G_1) = \min_{S \in \Pi(\mu_0, \mu_1)} \Bigl[ (1-\alpha) \sum_{i,j} M(i,j)\, S(i,j) + \alpha \sum_{i,i',j,j'} \bigl|C_0(i,i') - C_1(j,j')\bigr|^q\, S(i,j)\, S(i',j') \Bigr]^{1/q}.$$

This formulation unifies node-attribute and topological discrepancies and serves as the foundation for quantifying domain shift in the adaptation process. In practice, $q = 2$ and $\alpha = 0.5$ are typical choices (2505.12709).
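As a concrete illustration, the FGW objective can be evaluated directly for a fixed coupling $S$. The minimal NumPy sketch below computes only the objective value; the actual distance requires minimizing over all feasible couplings (e.g., with an optimal-transport solver), which this sketch does not attempt.

```python
import numpy as np

def fgw_objective(M, C0, C1, S, q=2, alpha=0.5):
    """Evaluate the FGW objective of order q for a FIXED coupling S.

    M  : (n0, n1) cross-graph feature cost, M[i, j] = ||X0[i] - X1[j]||^q
    C0 : (n0, n0) source structure matrix (adjacency)
    C1 : (n1, n1) target structure matrix (adjacency)
    S  : (n0, n1) coupling with marginals mu0, mu1
    """
    feature_term = np.sum(M * S)
    # |C0(i,i') - C1(j,j')|^q contracted against S(i,j) S(i',j')
    diff = np.abs(C0[:, None, :, None] - C1[None, :, None, :]) ** q
    structure_term = np.einsum('ijkl,ij,kl->', diff, S, S)
    return ((1 - alpha) * feature_term + alpha * structure_term) ** (1.0 / q)

# Toy example: two 3-node graphs with the independent (product) coupling,
# which is feasible but generally suboptimal.
rng = np.random.default_rng(0)
X0, X1 = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))
M = np.sum((X0[:, None, :] - X1[None, :, :]) ** 2, axis=-1)  # q = 2 cost
C0 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
C1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
mu0 = mu1 = np.full(3, 1 / 3)
S = np.outer(mu0, mu1)
val = fgw_objective(M, C0, C1, S)
```

Because the product coupling is rarely optimal, `val` is only an upper bound on the true FGW distance; a solver such as the POT library's fused Gromov–Wasserstein routines would minimize over $\Pi(\mu_0, \mu_1)$.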

2. Theoretical Framework and Error Bound in Gradual Adaptation

FGPA follows a $T$-stage self-training regime over a sequence $H_0 = G_0, H_1, \ldots, H_T = G_1$, where stage $t$ adapts model $f_{t-1}$ to $H_t$ using pseudo-labels. Theoretical analysis (Theorem 3.2) shows that, under standard Lipschitz/Hölder conditions on the per-node loss and the GNN architecture, the accumulated error on the target domain satisfies

$$\xi(f_T, G_1) \le \xi(f_0, G_0) + C_\ell\, \delta\, T + C \sum_{t=1}^{T} d_{\mathrm{FGW}}^q(H_{t-1}, H_t),$$

where $\xi(f, G)$ is the empirical risk, $C_\ell$ and $C$ encapsulate model smoothness constants, and $\delta$ bounds the per-stage self-training error. This decomposition clarifies the trade-off between the number of adaptation stages, the accumulation of self-training error, and the total geodesic length traversed in the FGW metric (2505.12709).

3. Construction and Properties of the FGW Geodesic Path

The optimal adaptation path is characterized as the FGW geodesic between $G_0$ and $G_1$. By Jensen's inequality and the metric properties of FGW, the cumulative discrepancy $\sum_{t=1}^{T} d_{\mathrm{FGW}}^q(H_{t-1}, H_t)$ is minimized when each $H_t$ lies at fraction $t/T$ along the interpolating curve

$$G(\lambda) = \bigl(V_0 \otimes V_1,\; (1-\lambda)\widetilde{A}_0 + \lambda \widetilde{A}_1,\; (1-\lambda)\widetilde{X}_0 + \lambda \widetilde{X}_1\bigr), \qquad \lambda \in [0, 1],$$

where $\widetilde{A}_0, \widetilde{X}_0, \widetilde{A}_1, \widetilde{X}_1$ are the adjacency and feature matrices re-arranged into the product node space via low-rank transformations derived from the FGW optimal coupling. Algorithmically, low-rank optimal transport with Dykstra updates computes the coupling, after which intermediate graphs are generated by linear interpolation (2505.12709).
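The interpolation step can be sketched under simplifying assumptions: a coupling between two equal-size graphs is taken as given, and the target's structure and features are pulled into the source node space by barycentric projection before linear blending. The paper's general construction operates in the product node space with low-rank couplings; this is only an illustrative reduction.

```python
import numpy as np

def interpolate_graph(A0, X0, A1, X1, S, lam):
    """Interpolate between coupled graphs at position lam in [0, 1].

    S aligns target nodes to source nodes; barycentric projection
    (row-normalized coupling) maps the target's adjacency and features
    into the source node space before linear interpolation.
    """
    P = S / S.sum(axis=1, keepdims=True)   # row-normalized coupling
    X1_aligned = P @ X1                     # project target features
    A1_aligned = P @ A1 @ P.T               # project target structure
    A_lam = (1 - lam) * A0 + lam * A1_aligned
    X_lam = (1 - lam) * X0 + lam * X1_aligned
    return A_lam, X_lam

# Toy example: 3-node graphs aligned by the identity coupling,
# producing the intermediate graphs H_1, ..., H_T for T = 3.
A0 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
A1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
X0, X1 = np.zeros((3, 2)), np.ones((3, 2))
S = np.eye(3) / 3
path = [interpolate_graph(A0, X0, A1, X1, S, t / 3) for t in (1, 2, 3)]
```

With the identity coupling, $\lambda = 0$ recovers $(A_0, X_0)$ and $\lambda = 1$ recovers $(A_1, X_1)$, so the endpoints of the path coincide with the source and target graphs as required.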

4. Self-Training along the FGW Geodesic with Confidence Modulation

FGPA performs adaptation via self-training on each intermediate graph. At stage $t$, predictions from $f_{t-1}$ serve as pseudo-labels for $H_t$. A confidence score is computed for each node via normalized entropy,

$$\mathrm{conf}(\hat y_i) = \frac{\max_j \mathrm{ent}(\hat y_j) - \mathrm{ent}(\hat y_i)}{\max_j \mathrm{ent}(\hat y_j) - \min_j \mathrm{ent}(\hat y_j)},$$

where $\mathrm{ent}(\cdot)$ denotes Shannon entropy. These confidences down-weight noisy predictions in the stage-wise supervised loss:

$$f_t = \arg\min_f \sum_{u \in V(H_t)} \mathrm{conf}(\hat y_u)\, \ell\bigl(f(H_t)_u, \hat y_u\bigr).$$

This denoising mechanism counteracts the self-training noise accumulation inherent in GDA (2505.12709).
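The confidence-weighting step maps directly to code. A minimal NumPy sketch of the normalized-entropy confidences and the resulting weighted pseudo-label loss (function names are illustrative):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of each row of an (n, k) probability matrix."""
    return -np.sum(p * np.log(p + eps), axis=1)

def confidence_weights(probs):
    """Normalized-entropy confidences: 1 for the least-entropic
    prediction, 0 for the most-entropic one."""
    ent = entropy(probs)
    return (ent.max() - ent) / (ent.max() - ent.min())

def weighted_self_training_loss(probs, pseudo_labels):
    """Confidence-weighted cross-entropy against pseudo-labels."""
    conf = confidence_weights(probs)
    nll = -np.log(probs[np.arange(len(probs)), pseudo_labels] + 1e-12)
    return np.sum(conf * nll)

# Toy example: a confident, a moderate, and a near-uniform prediction.
probs = np.array([[0.97, 0.02, 0.01],
                  [0.60, 0.30, 0.10],
                  [0.34, 0.33, 0.33]])
pseudo = probs.argmax(axis=1)       # pseudo-labels from f_{t-1}
loss = weighted_self_training_loss(probs, pseudo)
```

Note the normalization divides by the entropy range across nodes, so the near-uniform prediction receives weight 0 and is effectively excluded; a small guard would be needed if all predictions had identical entropy.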

5. Integration with Existing Graph DA Workflows

FGPA is orthogonal to the choice of domain adaptation loss and can be combined with existing graph DA techniques such as MMD, CORAL, AdaGCN, GRADE, StruRW, and adversarial or spectral regularizations. The total loss at each stage can incorporate an arbitrary graph DA penalty:

$$L_{\mathrm{total}} = \ell_{\mathrm{selftrain}}(f_t; H_t) + \lambda\, L_{\mathrm{DA}}(f_t; H_t, H_{t-1}).$$

FGPA thus serves as a path-augmentation module, enhancing any base DA method with fine-grained geodesic trajectory information rather than altering the adaptation loss itself. This modularity is a distinguishing feature of the approach (2505.12709).
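To make the plug-in structure concrete, the sketch below combines a self-training loss with one example DA penalty, a biased RBF-kernel squared MMD between embeddings of consecutive intermediate graphs. The MMD choice and all names are illustrative; CORAL, adversarial, or spectral penalties would slot into `total_stage_loss` the same way.

```python
import numpy as np

def rbf_mmd2(Z_s, Z_t, sigma=1.0):
    """Biased squared MMD with an RBF kernel between two embedding sets."""
    def k(A, B):
        d2 = (np.sum(A ** 2, 1)[:, None] + np.sum(B ** 2, 1)[None, :]
              - 2 * A @ B.T)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(Z_s, Z_s).mean() + k(Z_t, Z_t).mean() - 2 * k(Z_s, Z_t).mean()

def total_stage_loss(selftrain_loss, Z_prev, Z_curr, lam=0.1):
    """Stage loss: self-training term plus a pluggable DA penalty."""
    return selftrain_loss + lam * rbf_mmd2(Z_prev, Z_curr)

# Toy example: embeddings of H_{t-1} and a shifted H_t.
rng = np.random.default_rng(1)
Z_prev = rng.normal(size=(8, 4))
Z_curr = rng.normal(size=(8, 4)) + 0.5
L_total = total_stage_loss(2.0, Z_prev, Z_curr)
```

The biased squared-MMD estimator is nonnegative (it equals the squared norm of a mean-embedding difference), so the penalty can only add to the self-training loss here.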

6. Empirical Evaluation and Results

FGPA has been evaluated on diverse datasets, including real-world graphs—Airport (USA⇄Europe⇄Brazil), Citation (ACM⇄DBLP), Social (Blog1⇄Blog2)—and synthetic contextual SBMs with controlled shifts. Backbone architectures include 2–3 layer GCNs or APPNP with hidden dimensions 8–16. Standard adaptation baselines (ERM, MMD, CORAL, AdaGCN, GRADE, StruRW) serve as comparators.

Key experimental settings and results:

Dataset            Average Gain (pp)   Max Gain (pp)
Airport            +6.8                +26.3
Social             +3.6                —
Citation           +3.4                —
CSBM (synthetic)   +36.5               —
  • Geodesic path construction: $T = 3$ intermediate graphs, $q = 2$, $\alpha = 0.5$, low-rank OT of rank $r = 0.25|V|$.
  • Training: Adam optimizer, learning rate $5 \times 10^{-2}$, 1000 epochs, 5 random seeds.
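The reported settings can be collected into a single configuration sketch; the values come from the section above, while the key names themselves are illustrative rather than taken from any released code.

```python
# Hyperparameters as reported in the evaluation; key names are
# illustrative, not from the paper's code.
FGPA_CONFIG = {
    "path": {
        "T": 3,                  # number of intermediate graphs
        "q": 2,                  # FGW order
        "alpha": 0.5,            # feature/structure trade-off
        "ot_rank_fraction": 0.25 # low-rank OT rank r = 0.25 * |V|
    },
    "backbone": {
        "arch": ("GCN", "APPNP"),
        "layers": (2, 3),
        "hidden": (8, 16),
    },
    "training": {
        "optimizer": "Adam",
        "lr": 5e-2,
        "epochs": 1000,
        "seeds": 5,
    },
}
```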

FGPA improved node classification accuracy over direct adaptation (one-shot baseline) in over 90% of real-world domain adaptation tasks, with worst-case degradation under 2.6 percentage points in cases of mild shift (2505.12709).

7. Practical Implications and Future Directions

FGPA establishes a principled framework for generating and exploiting intermediate domains in the space of attributed graphs under severe non-IID shifts. By leveraging the optimality properties of the FGW geodesic and the flexibility of confidence-modulated self-training, it addresses robustness issues in graph DA scenarios where traditional metrics and heuristics fail. A plausible implication is the extension of this pipeline to other structured data modalities that admit Gromov–Wasserstein-type interpolations. Further, the plug-and-play compatibility with arbitrary domain adaptation losses supports a modular ecosystem for future graph adaptation research.
