Hilbert coVariance Networks (HVNs)
- Hilbert coVariance Networks (HVNs) are convolutional architectures that use the spectral properties of covariance operators in infinite-dimensional settings.
- They employ Hilbert coVariance Filters (HVFs) through spectral and polynomial methods to extract functional principal components and higher-order relationships.
- HVNs demonstrate robustness and transferability by effectively handling multivariate time series and functional data, outperforming traditional models in empirical validations.
Hilbert coVariance Networks (HVNs) are convolutional neural architectures constructed for signals defined over infinite-dimensional Hilbert spaces, in which processing, transformation, and representation are centered on the covariance operator rather than on pointwise kernels or finite-dimensional graph matrices. HVNs generalize covariance-based learning and convolution to settings that include functional data, multivariate time series, and reproducing kernel Hilbert spaces (RKHS), providing principled mechanisms for robust feature extraction, transferability, and the exploitation of higher-order relationships in high-dimensional or infinite-dimensional signals (Battiloro et al., 16 Sep 2025).
1. Construction of Hilbert coVariance Filters (HVFs) and HVNs
HVNs are fundamentally built from Hilbert coVariance Filters (HVFs) that transform input signals by filtering through the spectral decomposition of a covariance operator. Given a covariance operator $C$ on a Hilbert space $\mathcal{H}$ with eigenvalues $\{\lambda_i\}_{i \geq 1}$ and orthonormal eigenfunctions $\{\varphi_i\}_{i \geq 1}$, the canonical spectral representation is
$$C = \sum_{i \geq 1} \lambda_i \, \langle \cdot, \varphi_i \rangle \, \varphi_i .$$
The Hilbert coVariance Fourier Transform (HVFT) of a signal $x \in \mathcal{H}$ is the sequence of its projections onto these eigenfunctions:
$$\hat{x}_i = \langle x, \varphi_i \rangle, \qquad i \geq 1 .$$
A spectral HVF with frequency response $\hat{h} : [0, \infty) \to \mathbb{R}$ acts as
$$H(C)\, x = \sum_{i \geq 1} \hat{h}(\lambda_i)\, \langle x, \varphi_i \rangle\, \varphi_i + \hat{h}(0)\, P_{\ker C}\, x ,$$
where $P_{\ker C}$ is the projection onto the kernel of $C$. This operation allows pointwise manipulation of the frequency components (analogous to filtering in the frequency domain). Alternatively, a spatial (polynomial) HVF of order $K$ is written as
$$H(C) = \sum_{k=0}^{K} w_k\, C^{k} ,$$
so that its spectral response is $\hat{h}(\lambda) = \sum_{k=0}^{K} w_k\, \lambda^{k}$.
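The equivalence between the spectral and polynomial forms can be checked numerically on a finite, discretized surrogate of $C$. The following sketch (an illustration with hypothetical sizes, not the authors' code) applies a degree-2 filter in both forms and confirms that they coincide:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 50                                    # discretization size (assumed)
A = rng.standard_normal((m, 200))
C = A @ A.T / 200                         # surrogate empirical covariance matrix

lam, V = np.linalg.eigh(C)                # eigenvalues and orthonormal eigenvectors

def spectral_hvf(x, h):
    """Spectral form: sum_i h(lambda_i) <x, v_i> v_i."""
    return V @ (h(lam) * (V.T @ x))

def polynomial_hvf(x, w):
    """Polynomial form: sum_k w_k C^k x, evaluated with a Horner scheme."""
    y = w[-1] * x
    for wk in reversed(w[:-1]):
        y = C @ y + wk * x
    return y

x = rng.standard_normal(m)
w = np.array([0.2, -0.5, 1.0])            # filter taps w_0, w_1, w_2
h = lambda lam_: w[0] + w[1] * lam_ + w[2] * lam_**2

# Both realizations agree, matching the stated spectral response.
assert np.allclose(spectral_hvf(x, h), polynomial_hvf(x, w))
```

Here the surrogate $C$ has trivial kernel, so the $\hat{h}(0)\, P_{\ker C}$ term plays no role; in general the kernel contribution must be handled as in the spectral definition above.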
HVNs are constructed by stacking banks of such HVFs, thereby defining layers:
$$x_{\ell+1}^{f} = \sigma\!\left( \sum_{g=1}^{F_{\ell}} H_{\ell}^{fg}(C)\, x_{\ell}^{g} \right), \qquad f = 1, \dots, F_{\ell+1},$$
where $\sigma$ is a pointwise (elementwise in some basis) or more generally nonlinear activation. The full HVN mapping is thus parameterized by the collection of filter taps $\{ w_{\ell,k}^{fg} \}$, acting on $C$ and the initial signal collection $\{ x_0^{g} \}$.
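Schematically, a single layer could be implemented as below. This is a minimal sketch under assumed conventions (channel-major tap tensor, ReLU activation, a precomputed discretized covariance), not the reference implementation:

```python
import numpy as np

def hvn_layer(C, X, W, sigma=lambda z: np.maximum(z, 0.0)):
    """One HVN layer: Y[:, f] = sigma( sum_g sum_k W[f, g, k] * C^k @ X[:, g] ).

    C : (m, m) discretized covariance operator
    X : (m, F_in) input signals, one column per channel
    W : (F_out, F_in, K+1) learnable polynomial taps per (output, input) pair
    """
    K1 = W.shape[2]
    powers = [X]                        # C^0 X, C^1 X, ..., C^K X, shared by all banks
    for _ in range(K1 - 1):
        powers.append(C @ powers[-1])
    Z = np.stack(powers, axis=2)        # (m, F_in, K+1)
    Y = np.einsum('fgk,mgk->mf', W, Z)  # aggregate filtered input channels
    return sigma(Y)

# Toy usage with assumed sizes: 3 input channels -> 4 output channels, degree-2 filters.
rng = np.random.default_rng(1)
m = 50
A = rng.standard_normal((m, 200))
C = A @ A.T / 200
X = rng.standard_normal((m, 3))
W = 0.1 * rng.standard_normal((4, 3, 3))
Y = hvn_layer(C, X, W)                  # shape (m, 4)
```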
The key distinction from finite-dimensional covariance neural networks (VNNs) is that the convolution/filtering operations are genuinely infinite-dimensional, relying entirely on the operator-theoretic properties of $C$ rather than on matrix or fixed-kernel constructions (Battiloro et al., 16 Sep 2025).
2. Mathematical and Operator-Theoretic Framework
The foundation of HVNs relies on properties of trace-class (or compact, self-adjoint) covariance operators, spectral integration, and discretization. For a collection of i.i.d. samples $x_1, \dots, x_n \in \mathcal{H}$, the empirical covariance operator is constructed as
$$\hat{C} = \frac{1}{n} \sum_{j=1}^{n} \langle \cdot,\, x_j - \bar{x} \rangle \, (x_j - \bar{x}), \qquad \bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j ,$$
which is self-adjoint, finite-rank, and admits an eigen-decomposition with at most $n$ nonzero eigenvalues.
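Because the range of $\hat{C}$ is spanned by the centered samples, it can be applied to any element of $\mathcal{H}$ without forming an infinite-dimensional object: only inner products with the data are needed. A minimal sketch (with an assumed $L^2([0,1])$ setting, synthetic curves, and Riemann-sum inner products) is:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 401)                 # evaluation grid on [0, 1] (assumed)
dt = t[1] - t[0]
n = 30
# Synthetic functional samples: noisy sinusoids of a few frequencies.
X = np.array([np.sin(2 * np.pi * (k % 3 + 1) * t) + 0.1 * rng.standard_normal(t.size)
              for k in range(n)])
Xc = X - X.mean(axis=0)                        # centered samples x_j - x_bar

def inner(f, g):
    """Riemann-sum approximation of the L^2([0,1]) inner product."""
    return np.sum(f * g) * dt

def apply_empirical_cov(h):
    """(C_hat h)(t) = (1/n) sum_j <h, x_j - x_bar> (x_j - x_bar)(t)."""
    coeffs = np.array([inner(h, xj) for xj in Xc])
    return coeffs @ Xc / n

Ch = apply_empirical_cov(np.cos(2 * np.pi * t))  # image of a test function under C_hat
# rank(C_hat) <= n, since its range lies in span{x_j - x_bar}.
```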
Filtering and subsequent HVN transformations are implemented using either the full spectral decomposition (if computationally feasible) or polynomial approximations, facilitating efficient computation while preserving theoretical consistency with infinite-dimensional settings. The operator filter satisfies
$$\langle H(\hat{C})\, x,\, \hat{\varphi}_i \rangle = \hat{h}(\hat{\lambda}_i)\, \langle x,\, \hat{\varphi}_i \rangle ,$$
i.e., the output in each frequency is scaled by $\hat{h}(\hat{\lambda}_i)$, where $(\hat{\lambda}_i, \hat{\varphi}_i)$ denote the eigenpairs of $\hat{C}$.
Importantly, HVNs guarantee the ability to exactly replicate projections onto the eigenspaces of the covariance operator using specially constructed polynomial filters. That is, for each distinct nonzero eigenvalue $\lambda$ of $\hat{C}$, there exists a polynomial HVF $H_\lambda$ such that
$$H_\lambda(\hat{C}) = P_\lambda ,$$
where $P_\lambda$ is the orthogonal projector onto the eigenspace of $\lambda$. This property allows recovery of functional principal components (FPCA) by appropriately filtering the signal (Battiloro et al., 16 Sep 2025).
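One standard way to see why such a polynomial exists (an illustrative argument exploiting the finite rank of $\hat{C}$, not necessarily the construction used in the paper) is Lagrange interpolation over the finite set $\Lambda$ of distinct nonzero eigenvalues of $\hat{C}$: the polynomial
$$p_\lambda(t) = \frac{t}{\lambda} \prod_{\mu \in \Lambda \setminus \{\lambda\}} \frac{t - \mu}{\lambda - \mu}$$
satisfies $p_\lambda(\lambda) = 1$, $p_\lambda(\mu) = 0$ for every other $\mu \in \Lambda$, and $p_\lambda(0) = 0$, so the spectral theorem gives $p_\lambda(\hat{C}) = \sum_{\mu \in \Lambda} p_\lambda(\mu)\, P_\mu + p_\lambda(0)\, P_{\ker \hat{C}} = P_\lambda$.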
3. Discretization, Implementation, and Connection to Empirical Analysis
Because practical implementation of HVNs requires working with finite data and finite computation, the framework introduces a discretization operator $D_m : \mathcal{H} \to \mathbb{R}^m$. For instance, in function spaces, $D_m$ is often a bin-average map; for sequence spaces, it is the projection onto the first $m$ coordinates; in an RKHS, it might be evaluation at greedily chosen or equispaced points.
The empirical covariance matrix in the discrete (compressed) space is then
$$\hat{C}_m = \frac{1}{n} \sum_{j=1}^{n} \left( D_m x_j - \bar{x}_m \right) \left( D_m x_j - \bar{x}_m \right)^{\!\top}, \qquad \bar{x}_m = \frac{1}{n} \sum_{j=1}^{n} D_m x_j ,$$
which ensures that empirical filtering remains consistent with the operator-level filtering as the resolution $m$ grows (Proposition 1 in (Battiloro et al., 16 Sep 2025)).
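For the bin-average case, a minimal sketch of $D_m$ and of $\hat{C}_m$ (with an assumed fine grid, bin count, and synthetic data) is:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 400)                 # fine evaluation grid (assumed)
n, m = 30, 20                                  # samples, number of bins
X = np.array([np.sin(2 * np.pi * t + rng.uniform(0, 2 * np.pi)) for _ in range(n)])

def bin_average(x, m):
    """D_m: average a finely sampled function over m equal-width bins."""
    return x.reshape(m, -1).mean(axis=1)

Xm = np.array([bin_average(x, m) for x in X])  # (n, m) compressed samples D_m x_j
Xm_c = Xm - Xm.mean(axis=0)                    # centering
C_m = Xm_c.T @ Xm_c / n                        # (m, m) empirical covariance matrix
```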
The discretization allows HVNs to be deployed in a variety of settings, including but not limited to:
- Function spaces via binwise averaging
- Multivariate time series
- Sequence modeling
- Reproducing kernel Hilbert spaces, using the kernel trick and pointwise evaluations
This approach permits the transfer of mathematically rigorous infinite-dimensional filtering to practical, computationally tractable settings without sacrificing the structural connections to the underlying Hilbertian data.
4. Functional PCA Recovery and Theoretical Guarantees
HVNs possess the property that, via their spectral (or polynomial) filters, they can exactly extract the principal components of the sample covariance operator (a numerical sketch follows the list below):
- For each positive eigenvalue $\lambda$ of $\hat{C}$, the output $H_\lambda(\hat{C})\, x$ is the projection of $x$ onto the corresponding eigenspace.
- Taking inner products with the orthonormal eigenvectors (or functions) yields the principal scores.
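The following sketch (illustrative, with assumed low-rank data so that the interpolation polynomial stays well conditioned) verifies numerically that a Lagrange-type polynomial HVF reproduces the projector onto an eigenspace and hence the corresponding principal score:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 40, 5
S = rng.standard_normal((m, n))
C = S @ S.T / n                                   # low-rank surrogate covariance (rank <= 5)

lam, V = np.linalg.eigh(C)                        # ascending eigenvalues
nonzero = lam > 1e-10 * lam.max()
Lams = lam[nonzero]                               # nonzero eigenvalues (distinct, generically)
target = Lams[-1]                                 # top eigenvalue
v = V[:, -1]                                      # its eigenvector

def p_target(t):
    """Polynomial equal to 1 at `target`, 0 at the other nonzero eigenvalues and at 0."""
    out = t / target
    for mu in Lams[:-1]:
        out = out * (t - mu) / (target - mu)
    return out

# p_target(C), evaluated on the spectrum, should equal the rank-one projector v v^T.
P = (V * p_target(lam)) @ V.T
assert np.allclose(P, np.outer(v, v), atol=1e-8)

x = rng.standard_normal(m)
score_via_filter = v @ (P @ x)                    # principal score recovered by filtering
assert np.isclose(score_via_filter, v @ x)        # matches the direct FPCA score
```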
This capability, proven as Theorem 1 in (Battiloro et al., 16 Sep 2025), provides strong guarantees that the network can always recover the information underlying FPCA, but can further enrich it through composition of nonlinearities and by leveraging the cross-channel covariance structure, which cannot be fully exploited by FPCA or standard multilayer perceptrons (MLPs).
The HVN framework thus generalizes classical covariance-based dimensionality reduction by embedding it in a nonlinearly parameterized, layered operator filtering and feature extraction network.
5. Robustness, Transferability, and Empirical Validation
HVNs are validated through extensive experiments on both synthetic and real-world time-series classification tasks:
- On synthetic datasets involving multivariate Gaussian processes, HVNs outperform both MLPs and FPCA-based classifiers: the discriminative structure is captured through cross-channel covariance, which component-wise models cannot represent.
- On real data (e.g., ECG5000), HVNs achieve consistently higher accuracy across discretization levels compared to both MLP and FPCA classifiers.
These empirical results illustrate two key benefits:
- Robustness: HVNs are less prone to overfitting and handle low-sample or noisy regimes better because filtering leverages the covariance structure of the data.
- Transferability: Because their architecture is parameterized via the covariance operator rather than fixed-dimensional matrices, HVNs trained under one discretization generalize to data with different resolutions or even different Hilbert space representations, reflecting versatility analogous to group-equivariant CNNs but in the setting of functional and kernel data (Battiloro et al., 16 Sep 2025); a minimal illustration of this dimension-agnostic reuse follows this list.
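The sketch below illustrates the mechanism behind the transferability claim (with assumed data and resolutions; it is not the paper's experiment): the same polynomial taps define a valid HVF for a covariance matrix of any size, so trained weights can be reused unchanged across discretization levels.

```python
import numpy as np

rng = np.random.default_rng(5)

def bin_average(x, m):
    """D_m: average a finely sampled function over m equal-width bins."""
    return x.reshape(m, -1).mean(axis=1)

def polynomial_hvf(C, x, w):
    """Apply sum_k w_k C^k x; valid for any discretization size m."""
    y = np.zeros_like(x)
    Ck_x = x.copy()
    for wk in w:
        y += wk * Ck_x
        Ck_x = C @ Ck_x
    return y

t = np.linspace(0.0, 1.0, 480)
X = np.array([np.sin(2 * np.pi * t + rng.uniform(0, 2 * np.pi)) for _ in range(40)])
w = np.array([0.1, 0.8, -0.3])                     # "trained" taps, reused as-is

for m in (24, 48, 96):                             # three discretization resolutions
    Xm = np.array([bin_average(x, m) for x in X])
    Xm_c = Xm - Xm.mean(axis=0)
    C_m = Xm_c.T @ Xm_c / len(Xm)
    y_m = polynomial_hvf(C_m, bin_average(X[0], m), w)
    # The taps w apply unchanged at every resolution; only C_m and the signal change.
```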
6. Relevance to Broader Covariance-Based and Kernel Learning Paradigms
The HVN framework seamlessly unifies and extends several related lines of research:
- Robust kernel covariance operators and cross-covariance operators as robustified building blocks for handling contaminated data (Alam et al., 2016)
- Divergence-based comparisons between covariance operators for quantifying distributional distance between network nodes or features (Quang, 2019)
- Operator-valued positive definite kernels and Hilbert space–valued Gaussian processes for defining layer activations and covariance-preserving features (Jorgensen et al., 23 Apr 2024)
- Low-rank approximations and posterior covariance updates for model reduction and uncertainty quantification in infinite dimensions (Carere et al., 31 Mar 2025)
- Stability/transferability results inherited from finite-dimensional VNNs but extended to RKHS or general Hilbert space settings (Sihag et al., 2022, Battiloro et al., 16 Sep 2025)
A plausible implication is that HVNs form a natural bridge between modern geometric deep learning, kernel methods, and (infinite-dimensional) functional data analysis. Their operator-centric design affords principled generalization, resilience to data imperfections, and efficient selection of informative subspaces for both learning and inference in high-dimensional structured domains.
7. Summary Table: Core Constructs in HVNs
| Construct | Mathematical Role | Implementation Notes |
|---|---|---|
| Covariance operator $\hat{C}$ | Generates the spectral decomposition; basis for filtering | Empirical, trace-class |
| Hilbert coVariance Filter (HVF) | $H(\hat{C})$ acts spectrally or via a polynomial in $\hat{C}$ | Banked, learnable per layer |
| HVN layer | Stacks HVF banks + nonlinear activation | Extensible to deep architectures |
| Discretization operator $D_m$ | Compresses $\mathcal{H}$ to $\mathbb{R}^m$ for computation | Basis- or pointwise-agnostic |
| FPCA recovery | HVFs retrieve FPCA projections via spectral filtering | Exact for sample eigenspaces |
| Empirical covariance matrix $\hat{C}_m$ | Implements compressed filtering | Compatible with HVF design |
In summary, Hilbert coVariance Networks provide a rigorously constructed, operator-theoretic extension of graph and covariance neural networks to the infinite-dimensional setting, enabling robust, transferable, and theoretically grounded learning in functional data, multivariate sequences, and RKHS. They simultaneously subsume FPCA, stabilize against overfitting, and naturally align with the mathematical structure of covariance-based statistical learning (Battiloro et al., 16 Sep 2025).