RBF-Lifted Signature Kernel

Updated 31 December 2025

The RBF-lifted signature kernel is a universal, positive definite kernel that lifts continuous paths into a reproducing kernel Hilbert space using the Gaussian RBF feature map and the signature transform.
It employs random Fourier feature approximations to efficiently scale the computation of signature kernels while ensuring uniform error bounds under sub-Gaussian assumptions.
Variants like Diagonal Projection and Tensor Random Projection provide effective trade-offs, making the approach practical for large-scale time-series and sequence analysis.

The RBF-lifted signature kernel is a universal and characteristic positive definite kernel on the space of continuous paths, combining the expressivity of tensor algebra signatures with the nonlinear similarity afforded by the Gaussian radial basis function (RBF). It measures path similarity by lifting Euclidean increments into a reproducing kernel Hilbert space (RKHS), applying the signature transformation, and computing the Hilbert-Schmidt inner product in the resulting tensor algebra. Recent research demonstrates both explicit constructions and scalable random-feature approximations for this kernel, enabling efficient application to large-scale sequence and time-series analysis tasks (Toth et al., 2023, Piatti et al., 29 Dec 2025).

1. Mathematical Definition and Construction

Let $x, y : [0,T] \to \mathbb{R}^d$ be continuous paths of finite $p$ -variation. The full signature of $x$ over $[0,T]$ is given by

$\operatorname{Sig}(x)_{0,T} = \left(1,\,S^1(x),\,S^2(x),\,\dots\right) \in T((\mathbb{R}^d))$

where

$S^k(x) = \int_{0 < t_1 < \cdots < t_k < T} dx_{t_1} \otimes \cdots \otimes dx_{t_k} \in (\mathbb{R}^d)^{\otimes k}.$
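As a concrete illustration of the iterated integrals above, the following minimal sketch computes levels 1 and 2 of the signature for a piecewise-linear path in $\mathbb{R}^d$, using Chen's identity to append one linear segment at a time. The function name `signature_levels_1_2` and the example path are illustrative choices, not part of the cited works.

```python
import numpy as np

def signature_levels_1_2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: array of shape (L, d) holding the sample points x_1, ..., x_L.
    """
    increments = np.diff(path, axis=0)   # (L-1, d), one row per linear segment
    d = path.shape[1]
    s1 = np.zeros(d)
    s2 = np.zeros((d, d))
    for delta in increments:
        # Chen's identity for appending one linear segment:
        # S2 <- S2 + S1 (x) delta + (delta (x) delta) / 2
        s2 += np.outer(s1, delta) + 0.5 * np.outer(delta, delta)
        s1 += delta
    return s1, s2

# An L-shaped path: one unit step right, then one unit step up.
path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
s1, s2 = signature_levels_1_2(path)
```

Here `s1` is the total increment $x_T - x_0$, the symmetric part of `s2` equals $\tfrac{1}{2} S^1 \otimes S^1$, and the antisymmetric part `(s2[0, 1] - s2[1, 0]) / 2` is the Lévy area of the path, which is $0.5$ for this example.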

The RBF-lifted signature kernel is built from the following ingredients:

The static RBF kernel is $\kappa_\sigma(u, v) = \exp\Big(-\frac{1}{2\sigma^2}\|u - v\|^2\Big)$.
Its RKHS, $\mathcal{H}_\sigma$, is $L^2(\mathbb{R}^d, d\mu_\sigma)$, where $\mu_\sigma$ is the spectral measure associated with $\kappa_\sigma$ via Bochner’s theorem.
The feature map is $\phi(u)(\omega) = e^{i\omega^\top u}$.

The kernel is then constructed in three steps:

Lift the path $x$ pointwise into $\mathcal{H}_\sigma$ via $t \mapsto \phi(x_t)$.
Compute the signature of the lifted path in $T((\mathcal{H}_\sigma))$.
Take the inner product of the two lifted signatures, which defines the RBF-lifted signature kernel:

$K_{\mathrm{RBF}}(x, y) := \left\langle \operatorname{Sig}(\phi \circ x)_{0,T},\, \operatorname{Sig}(\phi \circ y)_{0,T} \right\rangle_{T((\mathcal{H}_\sigma))} = \sum_{k=0}^\infty \left\langle S^k(\phi \circ x),\, S^k(\phi \circ y)\right\rangle_{\mathcal{H}_\sigma^{\otimes k}}.$

This construction is universal and characteristic on path space and possesses invariance and stability properties inherited from both the signature and the RBF kernel (Piatti et al., 29 Dec 2025).
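For sampled paths, the definition can be sketched with the kernel trick. The example below is a minimal sketch under two assumptions that are not stated in the text above: the series is truncated at level `M`, and iterated integrals are replaced by iterated sums over strictly increasing sample indices. The names `rbf` and `lifted_sig_kernel` are illustrative.

```python
import numpy as np

def rbf(a, b, sigma):
    """Gaussian RBF Gram matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lifted_sig_kernel(x, y, sigma=1.0, M=3):
    """Truncated, discretized RBF-lifted signature kernel.

    x: (Lx, d) and y: (Ly, d) are sampled paths. Iterated integrals are
    approximated by iterated sums over strictly increasing indices.
    """
    G = rbf(x, y, sigma)
    # D[i, j] = <phi(x_{i+1}) - phi(x_i), phi(y_{j+1}) - phi(y_j)>
    D = G[1:, 1:] - G[1:, :-1] - G[:-1, 1:] + G[:-1, :-1]
    total = 1.0          # level 0 contributes 1
    A = D.copy()         # level-1 summands
    for _ in range(M):
        total += A.sum()
        # Exclusive 2-D cumulative sum: C[i, j] = sum of A[i', j'] with i' < i, j' < j
        C = np.zeros_like(A)
        C[1:, 1:] = np.cumsum(np.cumsum(A, axis=0), axis=1)[:-1, :-1]
        A = D * C        # summands of the next level
    return total
```

The recursion never forms a tensor: each level costs one pass over the $(L_x - 1) \times (L_y - 1)$ matrix of second-order kernel increments, which is where the quadratic dependence on path length comes from.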

2. Random Fourier Signature Feature Approximations

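Direct computation of $K_{\mathrm{RBF}}$ scales poorly: explicit features are unavailable because $\mathcal{H}_\sigma$ is infinite-dimensional and tensor dimensions grow exponentially with the signature level, while kernel-trick evaluation is quadratic in path length (Section 4). Random Fourier features (RFF) replace the static feature map with a finite-dimensional randomized map whose inner products approximate $\kappa_\sigma$.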

Draw $m$ i.i.d. samples $w_1, \ldots, w_m \sim \mathcal{N}(0, \sigma^{-2}I_d)$ .
The RFF map is: $\psi(x) = \frac{1}{\sqrt{m}} \left[\cos(w_1^\top x), \sin(w_1^\top x), \dots, \cos(w_m^\top x), \sin(w_m^\top x)\right]\in \mathbb{R}^{2m}.$
In the signature computation, replace each kernel-embedding increment $\delta k$ with the random-feature increment $\delta\psi$, and truncate at signature level $M$.
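The RFF map can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the helper name `make_rff` is hypothetical, and the cosine and sine blocks are concatenated rather than interleaved, which does not change inner products.

```python
import numpy as np

def make_rff(d, m, sigma, rng):
    """Return psi: R^d -> R^{2m} with E<psi(u), psi(v)> = kappa_sigma(u, v)."""
    W = rng.normal(scale=1.0 / sigma, size=(m, d))   # rows w_j ~ N(0, sigma^{-2} I_d)

    def psi(pts):
        proj = pts @ W.T                              # (n, m) projections w_j^T x
        return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(m)

    return psi

rng = np.random.default_rng(0)
sigma = 1.5
psi = make_rff(d=3, m=4000, sigma=sigma, rng=rng)
u, v = rng.normal(size=(2, 3))

approx = (psi(u[None]) @ psi(v[None]).T).item()          # random-feature estimate
exact = np.exp(-np.sum((u - v) ** 2) / (2 * sigma**2))   # kappa_sigma(u, v)
```

Because $\cos^2 + \sin^2 = 1$, every feature vector has unit norm, so $\langle \psi(u), \psi(u) \rangle = 1$ holds exactly and only the off-diagonal kernel values carry Monte Carlo error.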

For a discretized path $X = (x_1, \ldots, x_L)$, write $\delta\psi(x_i) = \psi(x_{i+1}) - \psi(x_i)$ and let $\psi_1, \ldots, \psi_M$ be independent copies of the RFF map. The resulting random feature signature map collects one tensor per level,

$\Psi^{\leq M}(X) = \left( \sum_{1 \leq i_1 < \cdots < i_r \leq L-1} \delta \psi_1(x_{i_1}) \otimes \cdots \otimes \delta \psi_r(x_{i_r}) \right)_{r=0}^{M},$

with the level-$0$ entry equal to $1$. Its inner product, summed over levels,

$\widehat{K}^{\leq M}(X, Y) = \left\langle \Psi^{\leq M}(X),\, \Psi^{\leq M}(Y)\right\rangle,$

is an unbiased estimator of the truncated RBF-lifted signature kernel: $\mathbb{E}[\,\widehat{K}^{\leq M}(X, Y)\,] = K^{\leq M}(X, Y)$ (Toth et al., 2023).
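A minimal sketch of the feature map for $M = 2$ follows. It assumes independent frequency matrices `W1` and `W2`, one per tensor slot, and reuses `W1` for level 1 as a simplification; all function names are illustrative rather than taken from the cited paper.

```python
import numpy as np

def rff_weights(d, m, sigma, rng):
    """Frequencies w_1, ..., w_m ~ N(0, sigma^{-2} I_d), stacked as rows."""
    return rng.normal(scale=1.0 / sigma, size=(m, d))

def rff(pts, W):
    """Random Fourier features of the rows of pts, shape (n, 2m)."""
    proj = pts @ W.T
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(W.shape[0])

def rfsf_features(x, W1, W2):
    """Levels 0 to 2 of a random Fourier signature feature map.

    x: (L, d) sampled path; W1, W2: independent (m, d) frequency matrices.
    """
    d1 = np.diff(rff(x, W1), axis=0)      # delta psi_1(x_i), shape (L-1, 2m)
    d2 = np.diff(rff(x, W2), axis=0)      # delta psi_2(x_i), shape (L-1, 2m)
    level1 = d1.sum(axis=0)
    # sum_{i<j} d1[i] (x) d2[j], via prefix sums of d1 taken strictly before j
    prefix = np.cumsum(d1, axis=0)[:-1]   # prefix[j-1] = sum_{i<j} d1[i]
    level2 = prefix.T @ d2[1:]            # (2m, 2m)
    return np.concatenate([[1.0], level1, level2.ravel()])

rng = np.random.default_rng(1)
W1 = rff_weights(d=2, m=16, sigma=1.0, rng=rng)
W2 = rff_weights(d=2, m=16, sigma=1.0, rng=rng)
x = rng.normal(size=(5, 2))
y = rng.normal(size=(4, 2))
k_hat = rfsf_features(x, W1, W2) @ rfsf_features(y, W1, W2)
```

The level-2 block alone has $(2m)^2$ entries, which illustrates why the explicit tensor features become expensive as the truncation level grows.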

3. Uniform Approximation Guarantees

Concentration inequalities bound the uniform error of the RFF-accelerated kernel over compact path spaces.

For paths taking values in a compact convex set $\mathcal{X} \subset \mathbb{R}^d$ with bounded $1$-variation $\|X\|_{1\text{-var}} \leq V$, and under sub-Gaussian moment bounds on the RFF distribution, there exist constants $C, c, \beta > 0$ such that for each signature level $k \leq M$ and error $\varepsilon > 0$:

$P\left(\sup_{X, Y}\, \left| \widehat{K}_k(X, Y) - K_k(X, Y) \right| \geq \varepsilon \right) \leq C \exp\left(-c\, \frac{m\, \varepsilon^2}{\beta^2}\right),$

where $m$ is the number of RFF draws and $\widehat{K}_k$, $K_k$ denote the level-$k$ components of the approximate and exact kernels. The precise bounds are given in Theorem 3.1 and equation (3.13) of Toth et al. (2023). To guarantee uniform error $\varepsilon$ for all levels $k \leq M$ with probability at least $1 - \delta$, one requires $m = \tilde{O}(\varepsilon^{-2} \,\log(CM/\delta))$ RFF draws.
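Taking the simplified tail bound above at face value, and treating $C$, $c$, and $\beta$ as fixed across levels (an assumption made here for illustration), the sample-size requirement follows from a union bound over the $M$ signature levels:

$M\, C \exp\left(-c\, \frac{m\, \varepsilon^2}{\beta^2}\right) \leq \delta \quad\Longleftrightarrow\quad m \geq \frac{\beta^2}{c\, \varepsilon^2}\, \log\frac{CM}{\delta}.$

The number of draws therefore scales as $\varepsilon^{-2}$ in the target accuracy but only logarithmically in the truncation level $M$ and the failure probability $\delta$.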

4. Computational Complexity and Scalable Variants

The exact signature kernel (e.g., as in Király–Oberhauser 2019) incurs $O(N^2 L^2 M)$ time for $N$ paths of length $L$ and truncation $M$ . RBF-lifted signature kernels with RFF approximation scale as: $\text{time} = O(N L M m), \quad \text{memory} = O(N m L) \text{ (or } O(Nm) \text{ for streaming)},$ eliminating the quadratic dependence in both dataset size and sequence length (Toth et al., 2023).

To further improve scalability, two variants are introduced:

Variant	Feature Dimension	Time Complexity
Diagonal Projection (DP)	$2^k$ (per-level)	$O(N L (m d + 2^M))$
Tensor Random Projection (TRP)	$O(M d)$	$O(N L M d^2)$

DP averages only the diagonal of the RFF tensor product, reducing features at the expense of slower concentration.
TRP sketches each tensor by a CP-rank-1 Gaussian map, yielding sub-exponential convergence with respect to the number of projections.
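The idea behind a rank-1 Gaussian sketch can be illustrated at level 2. The following is a minimal sketch, not the implementation from the cited paper: it contracts the tensor $\sum_{i<j} a_i \otimes b_j$ with random rank-1 tensors $p_k \otimes q_k$ without ever forming it. The function name `trp_level2` and the $1/\sqrt{r}$ normalization are assumptions.

```python
import numpy as np

def trp_level2(d1, d2, P, Q):
    """Rank-1 Gaussian sketches of T = sum_{i<j} d1[i] (x) d2[j].

    d1, d2: (n, D) increment arrays; P, Q: (r, D) Gaussian projection vectors.
    Returns the r numbers <p_k (x) q_k, T> / sqrt(r) without forming the
    D x D tensor T.
    """
    a = d1 @ P.T                          # (n, r): <p_k, d1[i]>
    b = d2 @ Q.T                          # (n, r): <q_k, d2[j]>
    prefix = np.cumsum(a, axis=0)[:-1]    # row j-1 holds sum_{i<j} <p_k, d1[i]>
    return (prefix * b[1:]).sum(axis=0) / np.sqrt(P.shape[0])

rng = np.random.default_rng(0)
n, D, r = 6, 4, 5
d1 = rng.normal(size=(n, D))
d2 = rng.normal(size=(n, D))
P = rng.normal(size=(r, D))
Q = rng.normal(size=(r, D))
sketch = trp_level2(d1, d2, P, Q)
```

For standard Gaussian $p$ and $q$, $\mathbb{E}[\langle p \otimes q, T\rangle \langle p \otimes q, T'\rangle] = \langle T, T'\rangle$, so the inner product of two such sketches is an unbiased estimate of the tensor inner product, at a cost of $O(n r D)$ rather than $O(n D^2)$.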

Empirical results demonstrate negligible accuracy loss for moderate $m$ and $M$, with both DP and TRP matching the full signature kernel on time series benchmarks and scaling efficiently to $10^6$ sequences (Toth et al., 2023).

5. Dynamic Lifting via RF-CDE

A complementary approach uses dynamic random-feature reservoirs via random Fourier controlled differential equations (RF-CDEs):

At each time $t$ , map $x_t$ to RFFs as $X^F_t = \phi^F(x_t) \in \mathbb{R}^{2F}$ .
Feed this lifted signal to a random linear CDE: $d Z_t^{N, F} = \frac{1}{\sqrt{N}} \sum_{j=1}^{2F} A_j Z_t^{N, F} dX_t^{F, j}, \quad Z_0^{N, F}\sim\mathcal{N}(0,I_N),$ yielding a feature vector $\psi_{N,F}(x) = N^{-1/2} Z^{N,F}_T(x) \in \mathbb{R}^N$ . Only a linear readout on top of $\psi_{N,F}(x)$ is trained.
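The two steps above can be sketched with an explicit Euler scheme. This is a minimal illustration under assumptions not fixed by the text: the matrices $A_j$ have i.i.d. standard Gaussian entries, the lifted signal carries a $1/\sqrt{F}$ normalization, and the CDE is discretized on the sample grid. The function name `rf_cde_features` is hypothetical.

```python
import numpy as np

def rf_cde_features(x, W, A, z0):
    """Euler discretization of a random linear CDE driven by the RFF-lifted path.

    x: (L, d) sampled path; W: (F, d) RFF frequencies;
    A: (2F, N, N) random matrices; z0: (N,) initial state.
    """
    F, N = W.shape[0], z0.shape[0]
    proj = x @ W.T
    lifted = np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(F)   # X^F_t, shape (L, 2F)
    z = z0.copy()
    for dX in np.diff(lifted, axis=0):
        # sum_j A_j dX^j is an (N, N) matrix; apply it to the current state
        z = z + np.tensordot(dX, A, axes=1) @ z / np.sqrt(N)
    return z / np.sqrt(N)

rng = np.random.default_rng(0)
d, F, N, L = 2, 8, 16, 10
W = rng.normal(size=(F, d))
A = rng.normal(size=(2 * F, N, N))      # assumed i.i.d. N(0, 1) entries
z0 = rng.normal(size=N)
x = np.cumsum(rng.normal(scale=0.1, size=(L, d)), axis=0)
features = rf_cde_features(x, W, A, z0)
```

Since `W`, `A`, and `z0` stay fixed after sampling, the same reservoir is applied to every path and only a linear model on `features` would need to be fitted.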

In the infinite-width limit $(N,F)\to(\infty,\infty)$ , $\frac{1}{N} \mathbb{E}[ \langle Z_T^{N,F}(x), Z_T^{N,F}(y) \rangle ] \to K_{\mathrm{RBF}}(x, y)$ , rigorously establishing that RF-CDEs realize the RBF-lifted signature kernel as their limiting covariance (Piatti et al., 29 Dec 2025).

6. Theoretical Properties

$K_{\mathrm{RBF}}$ is positive definite (Mercer kernel), universal, and characteristic. Universality and characteristicness are inherited from the classical signature kernel and the RBF kernel on $\mathbb{R}^d$ (Piatti et al., 29 Dec 2025).

Reparameterization invariance follows from the invariance of the path signature under time reparameterization.
Stability under $p$-variation: small perturbations of a path yield small changes in $K_{\mathrm{RBF}}$.
Invariance under a common translation or rotation of both paths is inherited from the RBF base kernel, which depends only on $\|u - v\|$.
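The behaviour under translation and rotation can be checked at the level of the RBF Gram matrix, since a discretized lifted kernel depends on the sampled paths only through the values $\kappa_\sigma(x_i, y_j)$. The sketch below is a numerical sanity check with illustrative names, not a proof.

```python
import numpy as np

def rbf_gram(x, y, sigma=1.0):
    """Gaussian RBF Gram matrix between the sample points of two paths."""
    sq = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma**2))

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(scale=0.3, size=(8, 2)), axis=0)
y = np.cumsum(rng.normal(scale=0.3, size=(6, 2)), axis=0)

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
shift = np.array([3.0, -1.0])

G_ref = rbf_gram(x, y)
G_moved = rbf_gram(x @ R.T + shift, y @ R.T + shift)   # same rigid motion on both paths
G_one_sided = rbf_gram(x + shift, y)                   # only one path is shifted
```

`G_moved` agrees with `G_ref` up to floating-point error, whereas `G_one_sided` does not: the property concerns a rigid motion applied to both paths at once, not to one path alone.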

This kernel provides a continuous-time, non-Euclidean analogue of RBF feature learning in sequence learning.

Selection of hyperparameters follows empirical trade-offs:

$m\sim 10^2\!-\!10^3$ suffices for $<1\%$ kernel error.
Use the Gaussian spectral measure or improvements such as quasi-Monte Carlo or orthogonal realizations.
Truncation $M = 3$ to $5$ captures the principal path interactions; cost grows exponentially with $M$ beyond that.
For high-dimensional data and small $M$, DP is recommended; for moderate $d$ and $M$, TRP reduces memory; vanilla RFSF has superior concentration, but its feature dimension grows as $m^M$.

The rough signature kernel is the limiting case when no nonlinear RBF warping is applied, corresponding to linear (log-)signature propagation in a random reservoir (Piatti et al., 29 Dec 2025). RF-CDE and R-RDE offer two complementary methods whose infinite-width limits recover the RBF-lifted and rough signature kernels, respectively, unifying perspectives on random-feature reservoirs and continuous-time deep sequence models.

References

"Random Fourier Signature Features" (Toth et al., 2023)
"Random Controlled Differential Equations" (Piatti et al., 29 Dec 2025)
