Graph Distribution-Valued Signals
- Graph Distribution-Valued Signals are an extension of graph signal processing that models each node's signal as a probability distribution to capture uncertainty and variability.
- The framework generalizes classical spectral methods using concepts like the Wasserstein metric and pushforward operations for robust filtering and prediction.
- It enables effective handling of incomplete, noisy, or misaligned data and supports advanced filter learning and statistical modeling in complex networks.
Graph Distribution-Valued Signals (GDSs) are a modern extension of classical graph signal processing (GSP) in which the signal at each node, or the global signal associated with a graph, is modeled as a probability distribution—typically in the Wasserstein space—rather than a fixed vector. This paradigm enables a rigorous and unified approach to handling uncertainties, stochasticity, incomplete observations, and misalignments that are prevalent in practical graph-structured data. The GDS framework preserves the core architecture of classical GSP while providing systematic generalizations for spectral analysis, filtering, and prediction in a measure-theoretic setting.
1. Mathematical Foundations: From Vector Signals to Distribution-Valued Signals
In classical GSP, a graph signal is typically modeled as a vector $x \in \mathbb{R}^n$, where $n$ is the number of nodes. The GDS framework generalizes this notion by replacing $x$ with a probability measure $\mu \in \mathcal{P}_p(\mathbb{R}^n)$, the Wasserstein space of Borel probability measures on $\mathbb{R}^n$ with finite $p$-th moments (Zhao et al., 30 Sep 2025, Ji et al., 2023). This space is metrized by the $p$-Wasserstein distance
$$W_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int_{\mathbb{R}^n \times \mathbb{R}^n} \|x - y\|^p \, d\pi(x, y) \right)^{1/p},$$
where $\Pi(\mu, \nu)$ is the set of all couplings with marginals $\mu$ and $\nu$.
A deterministic vector signal $x$ corresponds to a Dirac measure $\delta_x$, isometrically embedded in $\mathcal{P}_p(\mathbb{R}^n)$ (i.e., $W_p(\delta_x, \delta_y) = \|x - y\|$). This identification means GDSs strictly generalize classical graph signals.
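As a concrete illustration (not drawn from the cited papers), the following sketch computes the $p$-Wasserstein distance between two discrete measures on $\mathbb{R}^n$ by solving the Kantorovich linear program with SciPy, and checks the Dirac-embedding isometry $W_p(\delta_x, \delta_y) = \|x - y\|$; the function name and example points are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_p(xs, a, ys, b, p=2):
    """p-Wasserstein distance between two discrete measures.

    xs: (m, n) support points with weights a (summing to 1)
    ys: (k, n) support points with weights b (summing to 1)
    Solves the Kantorovich LP: min <pi, C> s.t. pi 1 = a, pi^T 1 = b, pi >= 0.
    """
    m, k = len(xs), len(ys)
    # Cost matrix C[i, j] = ||x_i - y_j||^p
    C = np.linalg.norm(xs[:, None, :] - ys[None, :, :], axis=-1) ** p

    # Marginal constraints on the flattened coupling pi (row-major, length m*k)
    A_eq = np.zeros((m + k, m * k))
    for i in range(m):               # row marginals: sum_j pi[i, j] = a[i]
        A_eq[i, i * k:(i + 1) * k] = 1.0
    for j in range(k):               # column marginals: sum_i pi[i, j] = b[j]
        A_eq[m + j, j::k] = 1.0
    b_eq = np.concatenate([a, b])

    res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun ** (1.0 / p)

# Dirac embedding check: W_p(delta_x, delta_y) should equal ||x - y||
x, y = np.array([[1.0, 0.0, 2.0]]), np.array([[0.0, 1.0, 1.0]])
print(wasserstein_p(x, np.array([1.0]), y, np.array([1.0])))   # ~ sqrt(3)
print(np.linalg.norm(x - y))
```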
The statistical modeling power of the GDS framework allows each node’s "value" to be modeled as part of a joint probability law, capturing both uncertainty (e.g., sensor noise, missing data) and intrinsic variability (e.g., stochastic processes). Practical constructions often exploit parametric distributions (such as Gaussians), possibly with covariances modeling inter-node dependencies (Zhao et al., 30 Sep 2025, Ji et al., 2023).
2. Core GSP Operations in the Distributional Setting
The GDS paradigm systematically generalizes classical GSP operations by adopting measure-theoretic notions of pushforward (distribution transport) induced by graph-based linear operators:
- Fourier Transform: If $V$ is the eigenbasis of the graph shift operator (e.g., Laplacian or adjacency matrix), the classical GFT of $x$ is $\hat{x} = V^\top x$. The GDS-Fourier Transform (GDS-FT) is the pushforward measure $\hat{\mu} = (V^\top)_{\#}\mu$, meaning $\hat{\mu}(B) = \mu\big((V^\top)^{-1}(B)\big)$ for Borel $B \subseteq \mathbb{R}^n$ (Zhao et al., 30 Sep 2025).
- Graph Filtering: Given a filter $F^{\mathcal{G}}$ (typically a polynomial in the shift operator), filtering a GDS $\mu$ is given by the pushforward $F^{\mathcal{G}}_{\#}\mu$. For $\mu = \delta_x$, this is $\delta_{F^{\mathcal{G}} x}$, thereby recovering classical filtering.
- These mappings preserve statistical structure; for example, if $\mu$ is Gaussian $\mathcal{N}(m, \Sigma)$, then $F^{\mathcal{G}}_{\#}\mu$ is $\mathcal{N}(F^{\mathcal{G}} m, F^{\mathcal{G}} \Sigma (F^{\mathcal{G}})^\top)$ (Zhao et al., 30 Sep 2025, Ji et al., 2023); see the sketch following this list.
- Convolutional Filtering in the Absence of a Fixed Graph: When the topology itself is random or not precisely known, the filtering map can incorporate network-structure uncertainty via a distribution over graph shift operators (Ji et al., 2020).
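As noted in the Gaussian bullet above, these pushforwards act linearly on means and by congruence on covariances. The following sketch (a toy graph and illustrative filter coefficients, not code from the cited works) makes this explicit for the GDS-FT and a polynomial graph filter.

```python
import numpy as np

# Small path graph on 4 nodes: Laplacian L = D - A
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

# Eigenbasis of the shift operator (graph Fourier basis)
eigvals, V = np.linalg.eigh(L)

# A Gaussian GDS on the graph: mean m, covariance Sigma (inter-node dependence)
rng = np.random.default_rng(0)
m = rng.normal(size=4)
B = rng.normal(size=(4, 4))
Sigma = B @ B.T + 0.1 * np.eye(4)          # symmetric positive definite

# Polynomial graph filter F = c0 I + c1 L + c2 L^2
c = [1.0, -0.5, 0.05]
F = c[0] * np.eye(4) + c[1] * L + c[2] * (L @ L)

# Pushforwards of the Gaussian law under the linear maps:
#   GDS-FT:    (V^T)_# N(m, Sigma) = N(V^T m, V^T Sigma V)
#   Filtering: F_#     N(m, Sigma) = N(F m,   F Sigma F^T)
m_hat, Sigma_hat = V.T @ m, V.T @ Sigma @ V
m_filt, Sigma_filt = F @ m, F @ Sigma @ F.T

# Dirac special case: a deterministic signal x maps to delta_{F x}
x = rng.normal(size=4)
print(F @ x)                                # classical filtering is recovered
```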
A summary mapping between classical and GDS concepts appears in the following table:
| Classical GSP | GDS Framework (Distribution-Valued) |
|---|---|
| Signal $x \in \mathbb{R}^n$ | Probability measure $\mu \in \mathcal{P}_p(\mathbb{R}^n)$ |
| GFT $\hat{x} = V^\top x$ | GDS-FT $(V^\top)_{\#}\mu$ (pushforward) |
| Filtering $F^{\mathcal{G}} x$ | Pushforward $F^{\mathcal{G}}_{\#}\mu$ |
| Euclidean ($\ell_2$) distance | $p$-Wasserstein distance $W_p$ |

When $\mu$ is a Dirac measure $\delta_x$, all operations reduce to the classical case.
3. Addressing Limitations of Vector-Based GSP
The traditional vector-valued signal model in GSP entails several limitations:
- Synchronous/Complete Observations: Classical methods require all nodes to be observed at the same time; this assumption fails for asynchronous, partially observed, or noisy data.
- Absence of Uncertainty Quantification: Deterministic models fail to capture stochasticity or uncertainty due to noise, sensor failures, or modeling errors.
- Requirement for Strict Correspondence: Classical graph filtering assumes strict alignment between source and target signals.
The GDS approach overcomes these by modeling incomplete, misaligned, or noisy data as probability distributions on $\mathbb{R}^n$; this enables robust inference and prediction under observational imperfections and stochastic data generation (Zhao et al., 30 Sep 2025, Ji et al., 2023):
- Incomplete observation or asynchronous acquisition is naturally expressed as marginalizing over unobserved dimensions (see the sketch after this list).
- Stochasticity and noise are encoded in the spread and covariance structure of the signal distribution.
- Operations such as prediction or filtering become robust to sample alignment due to the distributional viewpoint; strict vector correspondence is not required for signal comparisons or learning.
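In the Gaussian case, the marginalization mentioned in the first bullet reduces to sub-selecting the mean and covariance; a minimal sketch with hypothetical node values follows.

```python
import numpy as np

def marginalize_gaussian(m, Sigma, observed):
    """Marginal law of a Gaussian GDS N(m, Sigma) over an observed node subset.

    Dropping unobserved coordinates of a multivariate Gaussian simply selects
    the corresponding sub-vector of the mean and sub-block of the covariance;
    no imputation of the missing nodes is required.
    """
    idx = np.asarray(observed)
    return m[idx], Sigma[np.ix_(idx, idx)]

m = np.array([1.0, 2.0, 0.5, -1.0])
Sigma = np.array([[1.0, 0.3, 0.0, 0.1],
                  [0.3, 2.0, 0.2, 0.0],
                  [0.0, 0.2, 1.5, 0.4],
                  [0.1, 0.0, 0.4, 1.0]])

# Only nodes 0 and 2 report at this time step
m_obs, Sigma_obs = marginalize_gaussian(m, Sigma, [0, 2])
```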
Empirical results validate these advantages: classical GSP suffers severe degradation in robustness and prediction accuracy under masking and shuffling of observation windows, while GDS-based methods remain stable and accurate (Zhao et al., 30 Sep 2025).
4. Statistical Modeling: Joint Distributions and Dependency Structures
Constructing a GDS generally involves combining local marginals at each node with a statistical dependency model. A common approach is to use copulas,
$$F(x_1, \dots, x_n) = C_\theta\big(F_1(x_1), \dots, F_n(x_n)\big),$$
where $F_i$ is the CDF of the $i$-th marginal and $C_\theta$ is a copula function parameterized by dependency parameters $\theta$ (Zhao et al., 30 Sep 2025).
When all marginals are Gaussian $\mathcal{N}(m_i, \sigma_i^2)$ and $C_\theta$ is a Gaussian copula, the joint is itself a Gaussian on $\mathbb{R}^n$ with mean $m = (m_1, \dots, m_n)^\top$ and covariance $\Sigma = D R D$, where $D = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$ and $R$ is the correlation matrix of the copula.
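A minimal sketch of this Gaussian-copula construction, with illustrative (hypothetical) marginal parameters and copula correlation matrix:

```python
import numpy as np

# Per-node Gaussian marginals N(m_i, sigma_i^2) ...
m = np.array([0.0, 1.0, -0.5])
sigma = np.array([1.0, 0.5, 2.0])

# ... coupled through a Gaussian copula with correlation matrix R
R = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

# The joint is Gaussian on R^n with covariance Sigma = D R D,
# where D = diag(sigma_1, ..., sigma_n)
D = np.diag(sigma)
Sigma = D @ R @ D

# Draw samples from the resulting joint GDS law
samples = np.random.default_rng(1).multivariate_normal(m, Sigma, size=1000)
```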
Graph filtering acts by transforming the entire joint distribution, affecting both mean and covariance; the signal after filtering remains in the same law class (e.g., Gaussian).
This modeling flexibility supports principled filter learning, robust statistics, and various predictive tasks without the need for synchronous or complete data, as demonstrated on pandemic data with missing or shuffled observations (Zhao et al., 30 Sep 2025).
5. Graph Filter Learning in the GDS Framework
Learning optimal graph filters in the distributional setting amounts to minimizing a discrepancy between a filtered signal's law and a target law, typically in the Wasserstein metric. For example, suppose the source is $\mu = \mathcal{N}(m, \Sigma)$ and the target is $\mu_\star = \mathcal{N}(m_\star, \Sigma_\star)$. After filtering by $F^{\mathcal{G}}$, the law is $\mathcal{N}(F^{\mathcal{G}} m, F^{\mathcal{G}} \Sigma (F^{\mathcal{G}})^\top)$, and the squared $2$-Wasserstein distance is
$$W_2^2 = \|F^{\mathcal{G}} m - m_\star\|_2^2 + \mathrm{Tr}\left(F^{\mathcal{G}} \Sigma (F^{\mathcal{G}})^\top + \Sigma_\star - 2 \big[ \Sigma_\star^{1/2}\, F^{\mathcal{G}} \Sigma (F^{\mathcal{G}})^\top\, \Sigma_\star^{1/2} \big]^{1/2} \right).$$
The learning problem is then $\min_{F^{\mathcal{G}},\, \theta}\; W_2^2\big(F^{\mathcal{G}}_{\#}\mu,\ \mu_\star\big)$. This approach supports learning both the filter operator $F^{\mathcal{G}}$ and the dependency structure $\theta$, employing alternating minimization and gradient-based optimization (Zhao et al., 30 Sep 2025, Ji et al., 2023).
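A short sketch of the closed-form loss above, assuming Gaussian source and target laws and a generic filter matrix; the helper names are illustrative, and in practice this loss would be minimized over the filter (and copula) parameters with a gradient-based optimizer, as described above.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2_sq(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    S2_half = sqrtm(S2)
    cross = sqrtm(S2_half @ S1 @ S2_half)
    bures = np.trace(S1 + S2 - 2.0 * np.real(cross))
    return float(np.sum((m1 - m2) ** 2) + bures)

def filter_loss(F, m, Sigma, m_star, Sigma_star):
    """W2^2 between the filtered law F_# N(m, Sigma) and the target law."""
    return gaussian_w2_sq(F @ m, F @ Sigma @ F.T, m_star, Sigma_star)

# Toy example: compare the identity filter with a damped filter against a target law
n = 3
rng = np.random.default_rng(0)
m, m_star = rng.normal(size=n), rng.normal(size=n)
B = rng.normal(size=(n, n))
Sigma = B @ B.T + 0.1 * np.eye(n)
Sigma_star = 0.25 * Sigma                 # target: same shape, shrunk covariance

print(filter_loss(np.eye(n), m, Sigma, m_star, Sigma_star))
print(filter_loss(0.5 * np.eye(n), m, Sigma, m_star, Sigma_star))
```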
Empirical evidence shows that GDS-based filter learning achieves superior prediction accuracy and robustness in forecasting future graph signals subject to masking or shuffling, conditions under which classical GSP learners degrade or fail.
6. Connections to and Extensions of Classical GSP Theory
GDSs generalize vector-based GSP while fully retaining it as a special case. This is formalized:
- Reduction to classical GSP: If the observation is a Dirac measure $\delta_x$, all operations (Fourier, filtering, sampling) recover those in classical GSP.
- Systematic Map: Every GSP operation (spectral transform, filtering, sampling) has a GDS analogue realized through pushforwards and pullbacks of measures by linear graph operators or their associated bases (Zhao et al., 30 Sep 2025, Ji et al., 2023, Ji et al., 2020).
- Unifying stochastic and deterministic settings: The GDS framework unifies models such as random graph operators or signal uncertainty under a common measure-theoretic lens (Ji et al., 2020).
In addition, the GDS approach relates to:
- Vector- and function-valued extension of GSP: Vector-valued signals, as considered in (Caputo, 28 May 2025), can be viewed as GDSs where the distributions are supported on a product space with deterministic marginals.
- Flexible treatment of uncertainty: The optimal transport machinery (Wasserstein metric) provides a robust notion of distance for both signals and their structural dependencies, unlike the purely $\ell_2$-based metrics of standard signal processing.
7. Practical Implications and Applications
The GDS formalism is validated on real-world problems such as pandemic prediction (COVID-19 case data for multiple counties), where signals are frequently asynchronous, unpredictable, and noisy (Zhao et al., 30 Sep 2025):
- Robust graph filter learning: GDS-based algorithms outperform classical GSP algorithms under masking and shuffling perturbations of the training data.
- Flexible handling of uncertainty and missing data: Probability distributions natively represent incomplete observations and uncertainty, ensuring that prediction and reconstruction procedures remain valid and accurate under varied practical conditions.
- Extension to networks with uncertain or random topologies: When the underlying graph is itself learned or uncertain, the GDS framework can incorporate distributions over graph operators (Ji et al., 2020, Ji et al., 2023).
- Generalization to tasks beyond filtering: The same machinery enables distributional versions of sampling, denoising, and even graph learning, where both the signals and, if desired, the graph topologies are random with respect to some model.
By modeling graph signals as measures in the Wasserstein space, the GDS approach achieves a principled and rigorous generalization of classical GSP. The resulting framework captures uncertainty and irregularity inherent in modern network data, allows direct translations of standard GSP operations to the probabilistic domain, and demonstrates significantly increased robustness and modeling flexibility relative to classical approaches (Zhao et al., 30 Sep 2025, Ji et al., 2023, Ji et al., 2020).