
Random Fourier Signature Features

Updated 25 November 2025
  • Random Fourier Signature Features are a scalable method for approximating signature kernels, enabling efficient similarity measurement over sequential data.
  • The approach integrates tensor algebra with level-wise independent Random Fourier Feature maps to provide unbiased estimators with strong uniform error guarantees.
  • Reduction variants RFSF-DP and RFSF-TRP reduce computational and memory costs, making the method practical for large-scale time series and high-dimensional data.

Random Fourier Signature Features (RFSF) provide a scalable framework for approximating the signature kernel—a powerful similarity measure for sequential data—by leveraging random Fourier feature (RFF) methods within the tensor algebra of signature representations. This combination yields unbiased, uniform-approximation estimators for kernel methods on sequences, reducing computational barriers associated with classic signature kernel computation and enabling practical application to very large datasets while preserving expressive kernel structure (Toth et al., 2023).

1. The Signature Kernel and Tensor Algebra

Given a metric space $X \subset \mathbb{R}^d$, a sequence $x = (x_1, \ldots, x_L)$ of points in $X$, and a base kernel $k: X \times X \to \mathbb{R}$ with corresponding RKHS $\mathcal{H}$, the signature kernel encodes multilevel sequential interactions through the discrete signature map. The $m$-th level signature of $x$ is a tensor in $\mathcal{H}^{\otimes m}$, constructed by iteratively taking tensor products of differences along the path. Truncating at level $M$ gives the truncated signature kernel

$$K^{\leq M}(x, y) = \langle \sigma^{\leq M}(x), \sigma^{\leq M}(y) \rangle_{T(\mathcal{H})} = \sum_{m=0}^{M} \sum_{\substack{i \in \Delta_m(\ell_x - 1) \\ j \in \Delta_m(\ell_y - 1)}} \prod_{\ell=1}^{m} \delta^2_{i_\ell, j_\ell} k(x_{i_\ell}, y_{j_\ell}),$$

where $\delta^2_{i,j} k(x_i, y_j)$ denotes the second-order (mixed) difference of $k$ (Toth et al., 2023). Computing the Gram matrix for $N$ sequences of length $L$ incurs $O(N^2 L^2 d)$ time, rendering direct application infeasible at scale.
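The truncated signature kernel above admits a dynamic-programming recursion over the second-order differences of the base-kernel Gram matrix. The following NumPy sketch illustrates one such recursion under stated assumptions: an RBF base kernel, strictly increasing multi-indices, and an illustrative bandwidth `gamma` (the function names are not from the paper):

```python
import numpy as np

def rbf_gram(x, y, gamma=1.0):
    # Pairwise RBF base kernel between the points of two sequences.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def signature_kernel(x, y, M=3, gamma=1.0):
    """Truncated signature kernel K^{<=M}(x, y) via dynamic programming
    over second-order differences of the base kernel. A sketch of the
    standard O(M L^2) recursion, not the paper's reference code."""
    K = rbf_gram(x, y, gamma)
    # Second-order finite difference: delta^2_{i,j} k(x_i, y_j).
    dK = K[1:, 1:] - K[1:, :-1] - K[:-1, 1:] + K[:-1, :-1]
    A = dK.copy()                 # level-1 contributions
    total = 1.0 + A.sum()         # level 0 contributes the constant 1
    for _ in range(2, M + 1):
        # Prefix sums enforce strictly increasing indices i' < i, j' < j.
        C = np.cumsum(np.cumsum(A, axis=0), axis=1)
        S = np.zeros_like(A)
        S[1:, 1:] = C[:-1, :-1]
        A = dK * S
        total += A.sum()
    return total
```

Each level reuses the previous level's prefix-summed table, which is what keeps the cost quadratic in $L$ per level rather than exponential in $m$.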

2. Random Fourier Features for Signature Kernels

Random Fourier Features accelerate kernel methods via a mapping $\varphi_D(x)$ of dimension $D$ such that $\langle \varphi_D(x), \varphi_D(y) \rangle$ is an unbiased estimator of a translation-invariant kernel $k(x - y)$. For the Gaussian (RBF) kernel, this mapping takes the form $(1/\sqrt{D})\,[e^{i\langle \omega_1, x \rangle}, \ldots, e^{i\langle \omega_D, x \rangle}]$ with $\omega_i$ drawn from the kernel's spectral measure.
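A minimal sketch of this static RFF map, using the standard real-valued cosine/sine variant in place of the complex exponentials (for the RBF kernel $k(x, y) = e^{-\gamma \|x - y\|^2}$, the spectral measure is $\mathcal{N}(0, 2\gamma I)$; the function name is illustrative):

```python
import numpy as np

def rff_map(X, D, gamma=1.0, rng=None):
    """Random Fourier Features for the RBF kernel exp(-gamma * ||x - y||^2).
    Real cos/sin variant of the complex-exponential map in the text;
    frequencies are drawn from the spectral measure N(0, 2*gamma*I)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
    Z = X @ W
    # Stacking cos and sin gives a 2D-dimensional real feature whose
    # inner product is an unbiased estimate of the kernel.
    return np.concatenate([np.cos(Z), np.sin(Z)], axis=1) / np.sqrt(D)
```

With a large enough $D$, the Gram matrix `rff_map(X, D) @ rff_map(X, D).T` (same frequency draw for all inputs) closely tracks the exact RBF Gram matrix.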

The RFSF approach replaces the "static" feature map $\phi(x) = k(\cdot, x)$ in the discrete signature computation by level-wise independent RFF maps $\varphi^{(m)}_D(x)$. For a truncation level $M$ and feature size $D$, independent RFF matrices $W^{(m)}$ are drawn for each $m = 1, \ldots, M$. The level-$m$ RFSF kernel is then constructed using

$$\tilde{k}_m(x, y) = \frac{1}{D} \sum_{\ell=1}^{D} e^{i\langle \omega_\ell^{(m)}, x - y \rangle},$$

and the signature features as

$$\Phi_M(x) = \sum_{m=0}^{M} \sum_{i \in \Delta_m(\ell_x - 1)} \delta\varphi_1(x_{i_1}) \otimes \cdots \otimes \delta\varphi_m(x_{i_m}),$$

where $\delta\varphi_m(x_j) = \varphi^{(m)}(x_{j+1}) - \varphi^{(m)}(x_j)$ (Toth et al., 2023). The resulting kernel,

$$K^{\mathrm{RFSF}}_M(x, y) = \langle \Phi_M(x), \Phi_M(y) \rangle,$$

is an unbiased estimator of $K^{\leq M}(x, y)$.
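The construction of $\Phi_M$ can be sketched as a level-by-level recursion: each level takes prefix sums of the previous level's tensors (to enforce $i_1 < \cdots < i_m$) and outer-multiplies them with the increments of a fresh, independently drawn RFF map. This is a minimal illustration assuming real cos/sin features and small $M, D$ (the full tensor dimension $(2D)^m$ grows quickly, which is exactly what the reduction variants below address); function names are not from the paper:

```python
import numpy as np

def rfsf_features(x, M=2, D=8, gamma=1.0, rng=None):
    """Random Fourier Signature Features (sketch): level-wise independent
    RFF maps phi^{(m)} and tensor products of their increments along the
    path. Returns levels 0..M concatenated (dims 1, 2D, (2D)^2, ...)."""
    rng = np.random.default_rng(rng)
    L, d = x.shape
    feats = [np.ones(1)]                   # level 0 is the constant 1
    prev = None
    for m in range(1, M + 1):
        # Fresh, independent frequency matrix W^{(m)} for each level.
        W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
        Z = x @ W
        phi = np.concatenate([np.cos(Z), np.sin(Z)], axis=1) / np.sqrt(D)
        dphi = np.diff(phi, axis=0)        # increments delta phi^{(m)}, (L-1, 2D)
        if m == 1:
            U = dphi
        else:
            # Shifted prefix sums enforce strictly increasing indices.
            prefix = np.cumsum(prev, axis=0)
            shifted = np.vstack([np.zeros((1, prev.shape[1])), prefix[:-1]])
            U = np.einsum('ia,ib->iab', shifted, dphi).reshape(L - 1, -1)
        feats.append(U.sum(axis=0))
        prev = U
    return np.concatenate(feats)
```

To estimate $K^{\leq M}(x, y)$, both sequences must be mapped with the same random draw (e.g. the same seed), so that the inner product of their features is the unbiased estimator described above.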

3. Uniform Approximation Guarantees

RFSF enjoys high-probability, uniform approximation guarantees on compact domains. For fixed $M$ and sequences $x, y$ with bounded 1-variation, the supremum error between $K^{\leq M}(x, y)$ and its RFSF estimator concentrates, with failure probability decaying subexponentially in $D$:

$$\mathbb{P}\left( \sup_{x, y:\, \|x\|_1, \|y\|_1 \leq V} \left| K_m(x, y) - K^{\mathrm{RFSF}}_m(x, y) \right| \geq \varepsilon \right) \leq m\, C_{d, X}\, \alpha(\varepsilon) \exp\left[ -\frac{d}{2(d+1)(S^2 + R)}\, \beta(\varepsilon) \right],$$

where the constants depend on $d, M, V, R, S$ and the kernel's Lipschitz constant. This enables setting $D = O\big((1/\varepsilon^2) \log(C_{d,X}/\delta)\big)$ to attain error at most $\varepsilon$ with failure probability $\delta$ (Toth et al., 2023). The proof combines recursive bias propagation across tensor levels with Bernstein-type concentration in Banach spaces.

4. Scalable Tensor Reduction Variants: RFSF-DP and RFSF-TRP

Although RFSF is linear in sequence length $L$, the feature dimension scales as $O(D^M)$. Two reduction strategies, diagonal projection (RFSF-DP) and tensor random projection (RFSF-TRP), alleviate this:

  • RFSF-DP: projects onto diagonal tensor entries by averaging over $D$ independent RFFs at each level; total dimension is $D(2^{M+1} - 1)$.
  • RFSF-TRP: applies Johnson–Lindenstrauss-type random projections respecting the tensor CP structure, mapping $(\mathbb{R}^{2D})^{\otimes m} \to \mathbb{R}^D$ via rank-1 CP projections, yielding total dimension $MD$.

Both offer provable concentration inequalities: RFSF-DP has subexponential tails and RFSF-TRP has $1/(2m)$-subexponential tails in $D$. Feature extraction costs $O(NLD\,2^M)$ for RFSF-DP and $O(NLD^2 M)$ for RFSF-TRP. These properties ensure feasibility in high-throughput settings (Toth et al., 2023).
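The diagonal-projection idea can be sketched as follows: instead of one wide tensor, keep $D$ independent narrow copies, each built from a single random frequency per level (so level $m$ is only $2^m$-dimensional per copy due to the cos/sin pair), and average the copies. This is a hedged illustration of the $D(2^{M+1}-1)$ dimension count, assuming cos/sin features; the exact normalization and level-0 handling are my assumptions, not the paper's specification:

```python
import numpy as np

def rfsf_dp_features(x, M=2, D=4, gamma=1.0, seed=0):
    """RFSF-DP (sketch): D independent narrow RFSF copies, one random
    frequency per level per copy, averaged. Total dimension D*(2^(M+1)-1)."""
    rng = np.random.default_rng(seed)
    L, d = x.shape
    copies = []
    for _ in range(D):
        feats = [np.ones(1)]            # level 0 of this copy
        prev = None
        for m in range(1, M + 1):
            w = rng.normal(scale=np.sqrt(2 * gamma), size=d)
            z = x @ w
            phi = np.stack([np.cos(z), np.sin(z)], axis=1)   # (L, 2)
            dphi = np.diff(phi, axis=0)                      # (L-1, 2)
            if m == 1:
                U = dphi
            else:
                prefix = np.cumsum(prev, axis=0)
                shifted = np.vstack([np.zeros((1, prev.shape[1])), prefix[:-1]])
                U = np.einsum('ia,ib->iab', shifted, dphi).reshape(L - 1, -1)
            feats.append(U.sum(axis=0))
            prev = U
        copies.append(np.concatenate(feats))
    # Averaging the D copies' inner products = 1/sqrt(D) scaling per copy.
    return np.concatenate(copies) / np.sqrt(D)
```

Each copy has dimension $\sum_{m=0}^{M} 2^m = 2^{M+1} - 1$, so the concatenation recovers the stated $D(2^{M+1}-1)$ total.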

5. Empirical Scaling, Complexity, and Accuracy

On benchmark datasets, RFSF-DP and RFSF-TRP show negligible accuracy loss relative to the exact signature kernel (KSig) at moderate sizes ($N \leq 1000$) and superior performance versus alternative scalable approaches (Random Warping Series, flattened RFF) at larger scale ($N \geq 1000$). On the SITS1M satellite dataset ($N = 10^6$), RFSF-DP training with $D \approx 1000$, $M = 4$ completes in minutes, which is unattainable by other signature-based or kernel approaches.

| Method | Time per fit | Memory |
| --- | --- | --- |
| KSig (full) | $O(N^2 L^2 (M + d))$ | $O(N^2)$ |
| Classical RFF | $O(NLd^2)$ | $O(Nd^2)$ |
| RFSF-DP | $O(NLD\,2^M)$ | $O(ND\,2^M)$ |
| RFSF-TRP | $O(NLD^2 M)$ | $O(NDM)$ |

Accuracy is competitive: on SITS1M, RFSF-DP achieves $\sim 74\%$ test accuracy, compared to $\sim 61\%$ for RWS and $\sim 72\%$ for classical RFF (Toth et al., 2023).

6. Relation to Classical Random Fourier Features and High-Dimensional Learning

The construction of RFSF is rooted in Random Fourier Features as introduced in prior work (Liao et al., 2020), where RFFs are shown to give unbiased estimators for shift-invariant kernels and permit high-dimensional asymptotics. In the classical regime with large feature dimension $N$, the empirical Gram matrix of RFFs converges (in expectation) to the underlying kernel matrix. However, in the joint high-dimensional setting, where the data dimension $p$, the number of samples $n$, and $N$ scale comparably, convergence holds only in expectation and requires careful analysis. The explicit integration of RFFs into sequential signatures as in RFSF extends these techniques to the tensorial, non-Euclidean context, yielding expressive yet scalable representations (Toth et al., 2023, Liao et al., 2020).

7. Summary and Outlook

Random Fourier Signature Features inherit the expressivity and theoretical guarantees of signature kernels while providing strong uniform error controls and enabling linear computational scaling in both sequence length and data set size. Reduction variants RFSF-DP and RFSF-TRP further extend applicability to million-scale time series, with consistent empirical robustness and accuracy. The methodology aligns RFF-based kernel approximation with the algebraic richness of signatures, offering a principled approach for scalable, powerful sequential similarity in machine learning and data analysis (Toth et al., 2023).
