
RBF-Lifted Signature Kernel

Updated 31 December 2025
  • RBF-lifted signature kernel is a universal, positive definite kernel that lifts continuous paths into a reproducing kernel Hilbert space using Gaussian RBF and signature transforms.
  • It employs random Fourier feature approximations to efficiently scale the computation of signature kernels while ensuring uniform error bounds under sub-Gaussian assumptions.
  • Variants like Diagonal Projection and Tensor Random Projection provide effective trade-offs, making the approach practical for large-scale time-series and sequence analysis.

The RBF-lifted signature kernel is a universal and characteristic positive definite kernel on the space of continuous paths, combining the expressivity of tensor algebra signatures with the nonlinear similarity afforded by the Gaussian radial basis function (RBF). It measures path similarity by lifting Euclidean increments into a reproducing kernel Hilbert space (RKHS), applying the signature transformation, and computing the Hilbert-Schmidt inner product in the resulting tensor algebra. Recent research demonstrates both explicit constructions and scalable random-feature approximations for this kernel, enabling efficient application to large-scale sequence and time-series analysis tasks (Toth et al., 2023, Piatti et al., 29 Dec 2025).

1. Mathematical Definition and Construction

Let $x, y : [0,T] \to \mathbb{R}^d$ be continuous paths of finite $p$-variation. The full signature of $x$ over $[0,T]$ is given by

$$\operatorname{Sig}(x)_{0,T} = \left(1,\,S^1(x),\,S^2(x),\,\dots\right) \in T((\mathbb{R}^d))$$

where

$$S^k(x) = \int_{0 < t_1 < \cdots < t_k < T} dx_{t_1} \otimes \cdots \otimes dx_{t_k} \in (\mathbb{R}^d)^{\otimes k}.$$
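As a concrete illustration of the iterated integrals, the first two signature levels of a piecewise-linear path can be computed in closed form from its increments. The sketch below assumes the path is given as an array of sample points joined by straight segments; the function name `signature_level2` is hypothetical and not taken from the cited papers.

```python
import numpy as np

def signature_level2(path):
    """First two signature levels of a piecewise-linear path.

    path: array of shape (L, d), the vertices of the path (assumed input format).
    Returns (S1, S2) with shapes (d,) and (d, d).
    """
    inc = np.diff(path, axis=0)                # segment increments, shape (L-1, d)
    s1 = inc.sum(axis=0)                       # level 1: total increment
    prefix = np.cumsum(inc, axis=0) - inc      # sum of all earlier increments
    # Level 2: cross terms between earlier and later segments,
    # plus the within-segment contribution (1/2) * inc ⊗ inc.
    s2 = prefix.T @ inc + 0.5 * inc.T @ inc
    return s1, s2

# An L-shaped path in the plane.
path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
S1, S2 = signature_level2(path)
levy_area = 0.5 * (S2[0, 1] - S2[1, 0])        # antisymmetric part of level 2
```

The symmetric part of the second level always equals half the outer product of the first level with itself, so the genuinely new information at level 2 is the antisymmetric part (the Lévy area).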

For the RBF-lifted signature kernel,

  • The static RBF kernel is $\kappa_\sigma(u, v) = \exp\Big(-\frac{1}{2\sigma^2}\|u - v\|^2\Big)$.
  • Its RKHS, $\mathcal{H}_\sigma$, can be identified with a closed subspace of $L^2(\mathbb{R}^d, d\mu_\sigma)$, where $\mu_\sigma$ is the spectral measure associated with $\kappa_\sigma$ via Bochner's theorem.
  • The feature map is $\phi(u)(\omega) = e^{i\omega^\top u}$.
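To see why this feature map reproduces the RBF kernel, the following short derivation may help. It assumes that $\mu_\sigma$ is normalized to a probability measure, namely the Gaussian $\mathcal{N}(0, \sigma^{-2} I_d)$, and uses the standard formula for the characteristic function of a Gaussian.

```latex
\langle \phi(u), \phi(v) \rangle_{L^2(\mu_\sigma)}
  = \int_{\mathbb{R}^d} e^{i\omega^\top u}\, \overline{e^{i\omega^\top v}}\, d\mu_\sigma(\omega)
  = \mathbb{E}_{\omega \sim \mathcal{N}(0,\, \sigma^{-2} I_d)}\!\left[ e^{i\omega^\top (u - v)} \right]
  = \exp\!\Big( -\tfrac{1}{2\sigma^2} \|u - v\|^2 \Big)
  = \kappa_\sigma(u, v).
```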

To construct the kernel,

  • Lift the path $x$ pointwise into $\mathcal{H}_\sigma$ via $t \mapsto \phi(x_t)$,
  • Compute the signature in $T((\mathcal{H}_\sigma))$,
  • The RBF-lifted signature kernel is

$$K_{\mathrm{RBF}}(x, y) := \left\langle \operatorname{Sig}(\phi \circ x)_{0,T},\, \operatorname{Sig}(\phi \circ y)_{0,T} \right\rangle_{T((\mathcal{H}_\sigma))} = \sum_{k=0}^\infty \left\langle S^k(\phi \circ x),\, S^k(\phi \circ y)\right\rangle_{\mathcal{H}_\sigma^{\otimes k}}.$$

This construction is universal and characteristic on path space and possesses invariance and stability properties inherited from both the signature and the RBF kernel (Piatti et al., 29 Dec 2025).
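For discretely sampled paths, a truncated version of the kernel can be evaluated without ever forming the lifted signatures, because inner products of lifted increments reduce to second-order differences of the RBF Gram matrix. The sketch below is one possible first-order discretization (strictly increasing index tuples), computed by dynamic programming; it is an illustration under that assumption, not the reference algorithm of the cited papers, and the names `rbf_gram` and `lifted_sig_kernel` are hypothetical.

```python
import numpy as np

def rbf_gram(X, Y, sigma):
    """RBF Gram matrix between the points of two paths, shapes (Lx, d) and (Ly, d)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def lifted_sig_kernel(X, Y, sigma=1.0, M=3):
    """Truncated (levels 0..M) discretized RBF-lifted signature kernel."""
    G = rbf_gram(X, Y, sigma)
    # D[i, j] = <phi(x_{i+1}) - phi(x_i), phi(y_{j+1}) - phi(y_j)>
    D = G[1:, 1:] - G[1:, :-1] - G[:-1, 1:] + G[:-1, :-1]
    total = 1.0                      # level-0 term
    A = D.copy()                     # A[i, j]: level-r chains ending at (i, j)
    for _ in range(M):
        total += A.sum()
        # Sum of A over strictly earlier indices (i' < i, j' < j).
        P = np.cumsum(np.cumsum(A, axis=0), axis=1)
        S = np.zeros_like(A)
        S[1:, 1:] = P[:-1, :-1]
        A = D * S                    # extend every chain by one more step
    return total

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
Y = rng.normal(size=(4, 2))
k_xy = lifted_sig_kernel(X, Y, sigma=1.0, M=3)
```

This costs on the order of $L^2 M$ operations per pair of paths, which is the quadratic dependence on length that random-feature approximations aim to remove.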

2. Random Fourier Signature Feature Approximations

Direct computation of $K_{\mathrm{RBF}}$ is infeasible for long paths due to exponential growth in tensor dimensions. Random Fourier feature-based acceleration replaces the static embedding by randomized mappings to approximate the inner products.

  • Draw $m$ i.i.d. samples $w_1, \ldots, w_m \sim \mathcal{N}(0, \sigma^{-2}I_d)$.
  • The RFF map is: $\psi(x) = \frac{1}{\sqrt{m}} \left[\cos(w_1^\top x), \sin(w_1^\top x), \dots, \cos(w_m^\top x), \sin(w_m^\top x)\right]\in \mathbb{R}^{2m}.$
  • In signature computations, replace kernel embeddings $\delta k$ with $\delta\psi$ at each step and truncate at signature level $M$.
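The first two steps can be sketched in a few lines of NumPy. The helper names `sample_frequencies` and `rff_map` are hypothetical, and the features are stored as all cosines followed by all sines rather than interleaved, which does not change any inner product.

```python
import numpy as np

def sample_frequencies(m, d, sigma, rng):
    """Draw m frequencies w_i ~ N(0, sigma^{-2} I_d), returned with shape (m, d)."""
    return rng.normal(scale=1.0 / sigma, size=(m, d))

def rff_map(x, W):
    """Random Fourier features of x (shape (..., d)); output shape (..., 2m)."""
    proj = x @ W.T                                   # w_i^T x for each frequency
    feats = np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)
    return feats / np.sqrt(W.shape[0])

rng = np.random.default_rng(0)
sigma = 1.5
W = sample_frequencies(20000, 3, sigma, rng)
x = np.array([0.3, -0.2, 0.5])
y = np.array([-0.4, 0.1, 0.9])
approx = rff_map(x, W) @ rff_map(y, W)               # Monte Carlo estimate
exact = np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
```

Since $\cos a \cos b + \sin a \sin b = \cos(a - b)$, the inner product of two feature vectors is the sample mean of $\cos(w_i^\top (x - y))$, whose expectation is the RBF kernel value.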

The resultant random feature signature map is

$$\Psi^{\leq M}(X) = \sum_{r=0}^M \sum_{1 \leq i_1 < \cdots < i_r \leq L-1} \left[ \delta \psi_1(x_{i_1}) \otimes \cdots \otimes \delta \psi_r(x_{i_r}) \right],$$

whose inner product yields an estimator for the truncated RBF-lifted signature kernel:

$$\widehat{K}^{\leq M}(X, Y) = \left\langle \Psi^{\leq M}(X),\, \Psi^{\leq M}(Y)\right\rangle.$$

This estimator is unbiased: its expected value is exactly the truncated signature kernel, $\mathbb{E}[\,\widehat{K}^{\leq M}(X, Y)\,] = K^{\leq M}(X, Y)$ (Toth et al., 2023).
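A minimal sketch of this feature map follows, storing each level as a flattened block of one concatenated vector. It rests on two assumptions not spelled out above: that $\delta\psi_r(x_i) = \psi_r(x_{i+1}) - \psi_r(x_i)$, and that $\psi_1, \dots, \psi_M$ are RFF maps built from independently drawn frequency matrices. The function names are hypothetical.

```python
import numpy as np

def rff_map(X, W):
    """RFF map applied row-wise: X (L, d), W (m, d) -> (L, 2m)."""
    proj = X @ W.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(W.shape[0])

def rfsf_features(X, Ws):
    """Concatenated levels 0..M of the random Fourier signature features.

    X: path of shape (L, d); Ws: list of M frequency matrices, each (m, d).
    """
    n_steps = X.shape[0] - 1
    levels = [np.ones(1)]                       # level 0 is the scalar 1
    prev = np.ones((n_steps, 1))                # level r-1 sums over indices < i
    for W in Ws:
        d_psi = np.diff(rff_map(X, W), axis=0)  # increments, shape (L-1, 2m)
        # Tensor each earlier partial sum with the current increment, then flatten.
        terms = (prev[:, :, None] * d_psi[:, None, :]).reshape(n_steps, -1)
        incl = np.cumsum(terms, axis=0)         # sums over index tuples ending <= i
        levels.append(incl[-1])
        prev = np.vstack([np.zeros((1, incl.shape[1])), incl[:-1]])
    return np.concatenate(levels)

rng = np.random.default_rng(0)
sigma, m, M, d = 1.0, 8, 2, 2
Ws = [rng.normal(scale=1.0 / sigma, size=(m, d)) for _ in range(M)]
X = rng.normal(size=(5, d))
Y = rng.normal(size=(6, d))
k_hat = rfsf_features(X, Ws) @ rfsf_features(Y, Ws)   # kernel estimate
```

Each path is processed once, so the cost is linear in its length, but the level-$r$ block has $(2m)^r$ entries, which is what motivates the projected variants.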

3. Uniform Approximation Guarantees

Concentration inequalities bound the uniform error of the RFF-accelerated kernel over compact path spaces.

For a compact convex domain $\mathcal{X} \subset \mathbb{R}^d$ and paths $X$ in $\mathcal{X}$ of bounded $1$-variation, $\|X\|_{1\text{-var}} \leq V$, under sub-Gaussian moment bounds on the RFF distribution, there exist constants $C, c, \beta > 0$ such that for each signature level $k$ and error $\varepsilon > 0$:

$$P\left(\sup_{X, Y}\, \left| \widehat{K}_k(X, Y) - K_k(X, Y) \right| \geq \varepsilon \right) \leq C \exp\left(-c\, \frac{m\, \varepsilon^2}{\beta^2}\right),$$

where $m$ is the number of RFF samples; precise bounds are detailed in Theorem 3.1 and equation (3.13) (Toth et al., 2023). To guarantee uniform error $\varepsilon$ for all levels $k \leq M$ with probability $1 - \delta$, one requires $m = \tilde{O}\big(\varepsilon^{-2} \log(CM/\delta)\big)$ RFF draws.
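The sample-size requirement follows from the tail bound by a union bound over levels. The derivation below writes $m$ for the number of RFF draws and $k$ for the signature level, and assumes for simplicity that the same constants $C$, $c$, $\beta$ apply at every level $k \leq M$; in the cited result the constants may depend on the level, so this is only a sketch of the argument.

```latex
P\Big(\exists\, k \leq M:\ \sup_{X,Y} \big|\widehat{K}_k(X,Y) - K_k(X,Y)\big| \geq \varepsilon\Big)
  \;\leq\; \sum_{k=1}^{M} C \exp\!\Big(-c\,\frac{m\,\varepsilon^2}{\beta^2}\Big)
  \;=\; C M \exp\!\Big(-c\,\frac{m\,\varepsilon^2}{\beta^2}\Big)
  \;\leq\; \delta
\quad\Longleftrightarrow\quad
m \;\geq\; \frac{\beta^2}{c\,\varepsilon^2}\,\log\!\Big(\frac{C M}{\delta}\Big).
```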

4. Computational Complexity and Scalable Variants

The exact signature kernel (e.g., as in Király–Oberhauser 2019) incurs $O(N^2 L^2 M)$ time for $N$ paths of length $L$ and truncation $M$. RBF-lifted signature kernels with RFF approximation scale as

$$\text{time} = O(N L M m), \quad \text{memory} = O(N m L) \text{ (or } O(Nm) \text{ for streaming)},$$

eliminating the quadratic dependence on both dataset size and sequence length (Toth et al., 2023).

To further improve scalability, two variants are introduced:

| Variant | Feature Dimension | Time Complexity |
|---|---|---|
| Diagonal Projection (DP) | $2^k$ (per-level) | $O(N L (m d + 2^M))$ |
| Tensor Random Projection (TRP) | $O(M d)$ | $O(N L M d^2)$ |

  • DP averages only the diagonal of the RFF tensor product, reducing features at the expense of slower concentration.
  • TRP sketches each tensor by a CP-rank-1 Gaussian map, yielding sub-exponential convergence with respect to the number of projections.
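The principle behind rank-1 Gaussian sketching of a tensor product can be checked numerically. The sketch below is not the TRP algorithm of the cited paper; it only illustrates, for a two-fold tensor product and a hypothetical helper `rank1_sketch`, that projecting $u \otimes v$ onto random rank-1 tensors $a_p \otimes b_p$ with independent standard Gaussian factors preserves inner products in expectation.

```python
import numpy as np

def rank1_sketch(u, v, A, B):
    """Sketch of u ⊗ v with n rank-1 Gaussian projections.

    A: (n, d1), B: (n, d2) with i.i.d. N(0, 1) entries.
    Component p equals <a_p ⊗ b_p, u ⊗ v> = (a_p · u)(b_p · v), scaled by 1/sqrt(n).
    """
    return (A @ u) * (B @ v) / np.sqrt(A.shape[0])

rng = np.random.default_rng(0)
n = 200_000
A = rng.normal(size=(n, 3))
B = rng.normal(size=(n, 4))
u, u2 = np.array([1.0, 0.0, 0.0]), np.array([0.6, 0.8, 0.0])
v, v2 = np.array([0.5, 0.5, 0.5, 0.5]), np.array([1.0, 0.0, 0.0, 0.0])

estimate = rank1_sketch(u, v, A, B) @ rank1_sketch(u2, v2, A, B)
exact = (u @ u2) * (v @ v2)          # <u ⊗ v, u2 ⊗ v2>
```

The full tensor $u \otimes v$ is never formed: each sketch coordinate needs only two dot products, which is where the memory saving comes from.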

Empirical results demonstrate negligible accuracy loss for moderate $m$ and $M$, with both DP and TRP matching the full signature kernel on time series benchmarks and scaling efficiently to $10^6$ sequences (Toth et al., 2023).

5. Dynamic Lifting via RF-CDE

A complementary approach uses dynamic random-feature reservoirs via random Fourier controlled differential equations (RF-CDEs):

  • At each time $t$, map $x_t$ to RFFs as $X^F_t = \phi^F(x_t) \in \mathbb{R}^{2F}$.
  • Feed this lifted signal to a random linear CDE:

$$d Z_t^{N, F} = \frac{1}{\sqrt{N}} \sum_{j=1}^{2F} A_j Z_t^{N, F}\, dX_t^{F, j}, \quad Z_0^{N, F}\sim\mathcal{N}(0,I_N),$$

    yielding a feature vector $\psi_{N,F}(x) = N^{-1/2} Z^{N,F}_T(x) \in \mathbb{R}^N$. Only a linear readout on top of $\psi_{N,F}(x)$ is trained.
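The two steps above can be sketched with an explicit Euler discretization of the CDE on the sampling grid of the path. Several choices here are assumptions for illustration and are not specified in the text: the matrices $A_j$ are taken to have i.i.d. standard Gaussian entries, the lifted signal is normalized by $1/\sqrt{F}$, and the Euler scheme stands in for whatever solver is used in practice. The function name `rf_cde_features` is hypothetical.

```python
import numpy as np

def rf_cde_features(x, W, A, z0):
    """Euler-discretized RF-CDE features of a sampled path.

    x:  path samples, shape (L, d)
    W:  RFF frequencies, shape (F, d)
    A:  random matrices A_j stacked as (2F, N, N)  (assumed Gaussian)
    z0: initial state Z_0, shape (N,)
    """
    F, N = W.shape[0], z0.shape[0]
    proj = x @ W.T
    lifted = np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(F)
    z = z0.copy()
    for dX in np.diff(lifted, axis=0):           # increments of the lifted signal
        field = np.tensordot(dX, A, axes=1)      # sum_j A_j * dX^j, shape (N, N)
        z = z + field @ z / np.sqrt(N)           # one Euler step of the linear CDE
    return z / np.sqrt(N)                        # psi_{N,F}(x) = N^{-1/2} Z_T

rng = np.random.default_rng(0)
N, F, d = 16, 5, 2
W = rng.normal(size=(F, d))
A = rng.normal(size=(2 * F, N, N))
z0 = rng.normal(size=N)
x = 0.1 * rng.normal(size=(7, d))
features = rf_cde_features(x, W, A, z0)
```

The random quantities `W`, `A`, and `z0` are drawn once and shared across all paths; only a linear model fitted on the returned vectors would be trained.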

In the infinite-width limit $(N,F)\to(\infty,\infty)$, $\frac{1}{N} \mathbb{E}[ \langle Z_T^{N,F}(x), Z_T^{N,F}(y) \rangle ] \to K_{\mathrm{RBF}}(x, y)$, rigorously establishing that RF-CDEs realize the RBF-lifted signature kernel as their limiting covariance (Piatti et al., 29 Dec 2025).

6. Theoretical Properties

$K_{\mathrm{RBF}}$ is positive definite (Mercer kernel), universal, and characteristic. Universality and characteristicness are inherited from the classical signature kernel and the RBF kernel on $\mathbb{R}^d$ (Piatti et al., 29 Dec 2025).

  • Reparameterization invariance results from the time invariance of path signatures.
  • Stability under $p$-variation: small perturbations in path yield small changes in $K_{\mathrm{RBF}}$.
  • Invariance under applying the same translation or rotation to both paths is inherited from the RBF base kernel.

This kernel provides a continuous-time, non-Euclidean analogue of RBF feature learning in sequence learning.

Selection of hyperparameters follows empirical trade-offs:

  • $m \sim 10^2$ to $10^3$ suffices for $<1\%$ kernel error.
  • Use the Gaussian spectral measure or improvements such as quasi-Monte Carlo or orthogonal realizations.
  • Truncation at $M = 3$ to $5$ captures the principal path interactions; cost grows exponentially in $M$ beyond that.
  • For high-dimensional data and small $M$, DP is recommended; for moderate $d$ and $M$, TRP reduces memory; vanilla random Fourier signature features (RFSF) have superior concentration, but their feature dimension grows as $m^M$.
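The growth of the vanilla feature dimension is easy to quantify. The helper below is a hypothetical illustration that assumes the level-$r$ block is the full tensor product of $r$ copies of the $2m$-dimensional RFF map, so it has $(2m)^r$ entries.

```python
def rfsf_feature_dim(m, M):
    """Total dimension of levels 0..M when level r has (2m)**r entries."""
    return sum((2 * m) ** r for r in range(M + 1))

# Dimension for m = 100 random frequencies at increasing truncation levels.
dims = {M: rfsf_feature_dim(100, M) for M in (1, 2, 3)}
# dims == {1: 201, 2: 40201, 3: 8040201}
```

Even at $M = 3$ the vanilla map already needs about eight million coordinates per path for $m = 100$, which is why the projected variants become attractive as $M$ grows.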

The rough signature kernel is the limiting case when no nonlinear RBF warping is applied, corresponding to linear (log-)signature propagation in a random reservoir (Piatti et al., 29 Dec 2025). RF-CDE and R-RDE offer two complementary methods whose infinite-width limits recover the RBF-lifted and rough signature kernels, respectively, unifying perspectives on random-feature reservoirs and continuous-time deep sequence models.
