
STARK: Adaptive Denoising in Spatial Transcriptomics

Updated 15 December 2025
  • The paper introduces STARK, a denoising method that combines kernel ridge regression with an adaptive graph Laplacian to recover gene expression in ultra-low sequencing-depth regimes.
  • Methodologically, it employs an alternating minimization scheme that dynamically updates spatial and expression-based affinities to maintain sharp cell-type boundaries and enable spatial interpolation.
  • Empirical evaluations on real datasets demonstrate that STARK outperforms existing techniques, achieving higher label transfer accuracy and improved preservation of cell-type geometry.

Spatial Transcriptomics via Adaptive Regularization and Kernels (STARK) is a denoising methodology for spatial transcriptomics data, designed for robust gene expression recovery at ultra-low sequencing depths. The approach combines kernel ridge regression with a dynamically adaptive graph Laplacian regularizer, enabling effective noise suppression, sharp boundary preservation between cell types, and gene expression interpolation at arbitrary spatial coordinates. The STARK framework delivers closed-form solutions for each alternating minimization subproblem, securing both algorithmic and statistical convergence guarantees. Evaluated on real biological datasets, STARK demonstrates superior label transfer accuracy and cell-type geometry preservation relative to contemporary graph-based and manifold-denoising techniques (Kubal et al., 10 Dec 2025).

1. Motivation and Background

Spatial transcriptomics platforms (e.g. Stereo-seq, 10x Visium, Slide-seq) generate high-dimensional gene-expression vectors $F^\star(q)\in\R^d$ at spatial locations $q\in\Q\subset\R^2$. Although modern technologies increase the field of view, sequencing cost per pixel scales with the total number of reads. Operating in the ultra-low regime ($\mathcal O(10^2)$ reads/pixel) produces severe multinomial or Poisson noise and dropouts, resulting in very sparse observations $Y_i\in\R^d$. The spatial sampling is irregular, undermining the use of standard convolutional filters. Generic denoisers (e.g. Total Variation, non-local means) presuppose scalar images and regular grids; single-cell imputation techniques (MAGIC, SAVER, scImpute) neglect spatial structure; and manifold-learning methods such as SPROD build a one-shot graph from noisy data, often failing when the input is highly corrupted.

A viable solution must integrate spatial smoothing for denoising, maintain sharp inter-cell-type boundaries, incrementally adapt the underlying image model, offer provable guarantees, and support interpolation at arbitrary $q$.

2. Mathematical Formulation

Let observed pixels be $\{q_i\}_{i=1}^m \subset \Q$ with measurements $Y_i = F^\star(q_i) + V_i$, where $V_i$ represents noise. STARK seeks a function $F: \Q \to \R^d$ in a vector-valued RKHS $\H^d$, alongside row-stochastic weights $W \in \R_+^{m \times m}$ defined on a neighbor graph $\E_\tau = \{(i,k) : \|q_i-q_k\|\le\tau\}$. The objective function $J(F,W)$ incorporates:

  • Data-fit: $\frac{1}{m}\sum_{i=1}^m \|Y_i - F(q_i)\|^2$
  • RKHS Ridge: $\lambda\|F\|_{\H^d}^2,\ \lambda > 0$
  • Graph Laplacian regularizer: $\frac{\omega}{2m}\sum_{(i,k)\in \E_\tau} W_{ik} \|F(q_i)-F(q_k)\|^2$
  • Entropy term for $W$: steers regularization using spatial and gene-expression proximity

The joint minimization problem is block-convex: $\min_{F \in \H^d, W \in \C_\tau} J(F,W)$, where $\C_\tau$ is the convex set enforcing the neighborhood, normalization, and non-negativity constraints. The representer theorem reduces this to a finite-dimensional optimization. For $\bK$ the kernel matrix and $\Theta$ the coefficient matrix,

$F(\cdot) = \frac{1}{\sqrt{m}}\sum_{i=1}^m \K(q_i, \cdot)\theta_i, \quad \bF = \sqrt{m}\,\bK\Theta$

The augmented loss reads $J(F,W)= \frac{1}{m} \|\bY-\sqrt{m}\,\bK\Theta\|_F^2 + \lambda \Tr(\Theta^\top\bK\Theta) + \omega\Bigl\{\Tr\!\bigl[\Theta^\top\bK\bar L_W\bK\Theta\bigr]+E(W)\Bigr\}$, with $\bar L_W$ denoting the graph Laplacian of $W$.
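To make the finite-dimensional objective concrete, here is a minimal NumPy sketch evaluating $J$ for a fixed $W$. Function and variable names are hypothetical (not from the paper), and the entropy term $E(W)$ is omitted since its exact form is not reproduced above:

```python
import numpy as np

def augmented_loss(K, Theta, Y, L_bar, lam, omega):
    """Evaluate J(F, W) for fixed W (entropy term E(W) omitted).

    K     : (m, m) kernel matrix
    Theta : (m, d) coefficients, with F = sqrt(m) * K @ Theta
    Y     : (m, d) noisy observations
    L_bar : (m, m) graph Laplacian of the current weights W
    """
    m = K.shape[0]
    F = np.sqrt(m) * K @ Theta                       # representer expansion
    data_fit = np.linalg.norm(Y - F, "fro") ** 2 / m
    ridge = lam * np.trace(Theta.T @ K @ Theta)      # RKHS ridge penalty
    laplacian = omega * np.trace(Theta.T @ K @ L_bar @ K @ Theta)
    return data_fit + ridge + laplacian
```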

3. Incrementally Adaptive Graph Laplacian

At initialization, edges are purely spatial: $\tilde W^{(0)}_{ik} = \exp(-\|q_i-q_k\|^2/s_2^2)\,1_{\{\|q_i-q_k\|\le\tau\}}, \quad W^{(0)}=\mathrm{RowNormalize}(\tilde W^{(0)})$

For each iteration $t$, given the current estimate $F^t$,

$\tilde W^{(t+1)}_{ik} = \exp(-\|F^t(q_i)-F^t(q_k)\|^2 / s_1^2)\, \exp(-\|q_i-q_k\|^2 / s_2^2)\, 1_{\{\|q_i-q_k\|\le\tau\}}$

Rows are normalized to sum to 1. This formulation is derived as the closed-form minimizer of the entropic graph Laplacian regularizer component, balancing spatial and expression-based affinity.
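As a concrete illustration, a minimal NumPy sketch of this affinity construction follows (function and variable names are hypothetical; self-loops are dropped as a common convention, and each pixel is assumed to have at least one neighbor within $\tau$):

```python
import numpy as np
from scipy.spatial.distance import cdist

def update_weights(q, F, s1, s2, tau):
    """Entropic W-update: joint spatial/expression affinities, row-normalized.

    q : (m, 2) pixel coordinates;  F : (m, d) current denoised expressions.
    Pass F=None for the purely spatial initialization W^(0).
    """
    d_sp = cdist(q, q, "sqeuclidean")            # squared spatial distances
    W = np.exp(-d_sp / s2**2)
    if F is not None:                            # expression-based affinity
        W *= np.exp(-cdist(F, F, "sqeuclidean") / s1**2)
    W[d_sp > tau**2] = 0.0                       # tau-neighborhood indicator
    np.fill_diagonal(W, 0.0)                     # drop self-loops
    return W / W.sum(axis=1, keepdims=True)      # rows sum to 1
```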

4. Alternating Minimization and Algorithmic Structure

STARK employs an alternating minimization (block coordinate descent) scheme:

  1. F-update (Kernel Ridge Regression):

$(\bK^2 + \lambda \bK + \omega \bK\bar L_{W^t}\bK) \Theta^{t+1} = \frac{1}{\sqrt{m}} \bK\bY, \quad \Theta \in \Range(\bK)$

The solution:

$\Theta^{t+1} = \frac{1}{\sqrt{m}}(\bK^2 + \lambda\bK + \omega\bK\bar L_{W^t}\bK)^+ \bK\bY$

  2. W-update (Entropic Graph Fitting):

$\hat W^{t+1}_{ik} = \frac{\tilde W^{(t+1)}_{ik}}{\sum_\ell \tilde W^{(t+1)}_{i\ell}}$

  3. Iteration: Repeat for $N$ steps; typically $N=7$ suffices. A minimal sketch of the full loop follows this list.
  4. Complexity: Dominated by $O(m^3)$ operations per iteration owing to matrix inversion or factorization. For $m \approx 5{,}000$, this is tractable on modern hardware, with additional efficiency available via Laplacian sparsity and iterative linear solvers.
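The sketch below strings the two updates together, reusing the hypothetical `update_weights` helper from Section 3. It uses a dense pseudo-inverse for clarity; the precise normalization of $\bar L_W$ follows the paper and is only approximated here by symmetrizing $I - W$:

```python
import numpy as np

def stark_denoise(K, Y, q, lam, omega, s1, s2, tau, n_iter=7):
    """Alternating minimization: closed-form F-update, then W-update."""
    m = K.shape[0]
    W = update_weights(q, None, s1, s2, tau)     # spatial-only W^(0)
    F = None
    for _ in range(n_iter):
        L = np.eye(m) - W                        # D - W; D = I for row-stochastic W
        L_bar = 0.5 * (L + L.T)                  # symmetrized Laplacian (approximation)
        A = K @ K + lam * K + omega * K @ L_bar @ K
        # closed-form KRR solve; pseudo-inverse keeps Theta in Range(K)
        Theta = np.linalg.pinv(A) @ K @ Y / np.sqrt(m)
        F = np.sqrt(m) * K @ Theta               # denoised expressions at pixels
        W = update_weights(q, F, s1, s2, tau)    # entropic W-update
    return F, W
```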

5. Theoretical Guarantees

Under modeled noise regimes and total reads $R$, with regularization parameters $\lambda=O(R^{-1})$, $\omega=O(R^{-1})$:

  • Statistical Convergence: Any stationary point $(\bar F, \bar W)$ of $J$ achieves

$\mathbb{E}\bigl[\|\bar F - F^\star\|_{L^2_m}\bigr] = O(R^{-1/2})$

by comparison to oracle graph minimization and leveraging concentration for multinomial noise.

  • Algorithmic Convergence: Alternating updates converge to a stationary point via block-convexity, unique block minimizers, and use of properties from coordinate descent theory (Kurdyka–Łojasiewicz property, Attouch–Bolte framework).

6. Empirical Evaluation and Benchmarking

Empirical assessment is conducted on the Mouse Organogenesis Spatiotemporal Transcriptomic Atlas (MOSTA) E9.5 snapshot via Stereo-seq, with 15,717 genes across 5,503 pixels ($R_0\simeq 7.6\times 10^7$ reads).

Evaluation Metrics:

  • Label transfer accuracy: $k$-NN classification of cell-type labels on low-dimensional PCA embeddings (fraction correct); see the sketch after this list.
  • kNN overlap: Fractional match between directed kNN graphs on original and denoised data.
  • Relative error: $\|\bar\bF - \bF_0\|_F / \|\bF_0\|_F$.
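A hedged sketch of the first two metrics (helper names and parameter choices such as `n_pcs` and `k` are illustrative, not the paper's exact protocol):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors

def label_transfer_accuracy(F_denoised, labels, n_pcs=50, k=15):
    """Cross-validated k-NN accuracy of cell-type labels on a PCA embedding."""
    Z = PCA(n_components=n_pcs).fit_transform(F_denoised)
    clf = KNeighborsClassifier(n_neighbors=k)
    return cross_val_score(clf, Z, labels, cv=5).mean()

def knn_overlap(F_orig, F_denoised, k=15):
    """Average fractional overlap between directed k-NN graphs."""
    idx_a = NearestNeighbors(n_neighbors=k).fit(F_orig).kneighbors(
        F_orig, return_distance=False)
    idx_b = NearestNeighbors(n_neighbors=k).fit(F_denoised).kneighbors(
        F_denoised, return_distance=False)
    return float(np.mean([len(set(a) & set(b)) / k
                          for a, b in zip(idx_a, idx_b)]))
```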

Comparison is performed against SPROD (Wang et al.), GraphPCA (Yang et al.), and STAGATE (Dong et al.), each optimally tuned for label accuracy.

Performance:

  • STARK exceeds competing methods in label transfer accuracy and kNN overlap, with the gap most pronounced at ultra-low read depths ($R/m \in [14, 200]$ reads/pixel).
  • At $R/m\approx 100$ reads/pixel: STARK reaches $\approx 0.85$ accuracy vs SPROD $\approx 0.80$, GraphPCA $\approx 0.74$, STAGATE $\approx 0.70$.
  • GraphPCA and STAGATE yield lower relative error but less effective geometry preservation of cell types.
  • Interpolation capabilities to unseen locations are demonstrated through subsampling experiments.

Visual Analysis:

  • Denoised cell-type spatial maps and variant comparisons illustrate performance (Figure 1, Figure 2 in source).

7. Practical Implementation Considerations

Hyperparameter Selection:

  • Kernel length scale ($l$): set so that each pixel has ≈7 neighbors within radius $l$; then $s_2 = l$, $\tau = 1.5l$.
  • After the first iterate, $s_1$ is set to the 75th percentile of pairwise gene-expression distances.
  • Regularization: $\lambda = \alpha\|\bK\|_{\text{op}}$, $\omega = 6\alpha\|\bK\|_{\text{op}}$, with $\alpha$ optimized by matching residuals to read-count statistics.
  • Iterations: $N=7$ is empirically sufficient (see the sketch after this list).
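A sketch of how these heuristics might be wired together (names are hypothetical; `alpha` still has to be tuned against read-count statistics as described above):

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist

def stark_hyperparameters(q, F_first, n_neighbors=7):
    """Heuristic length scales following the rules listed above."""
    # l: typical radius at which each pixel has ~n_neighbors neighbors
    d = np.sort(cdist(q, q), axis=1)          # column 0 is the self-distance
    l = np.median(d[:, n_neighbors])
    s2, tau = l, 1.5 * l
    # s1: 75th percentile of pairwise gene-expression distances (first iterate)
    s1 = np.percentile(pdist(F_first), 75)
    return dict(s1=s1, s2=s2, tau=tau)

def regularizers(K, alpha):
    """lambda and omega scaled by the operator norm of the kernel matrix."""
    K_op = np.linalg.norm(K, 2)               # largest singular value
    return alpha * K_op, 6 * alpha * K_op     # (lambda, omega)
```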

Validation:

  • Hyperparameters are optimized by downsampling real counts to validation sets for each read depth.

Computational Resource Notes:

  • For sample sizes $m \approx 5{,}000$, direct matrix solves ($O(m^3)$) take a few seconds on multi-core architectures; further efficiency is possible through Laplacian sparsity and iterative solvers.

STARK’s synthesis of kernel ridge regression and incrementally adaptive Laplacian regularization establishes it as a practical, edge-aware denoising tool for sparse, noisy spatial transcriptomics data, supported by strong theoretical guarantees and demonstrably superior empirical performance in the ultra-low sequencing depth regime (Kubal et al., 10 Dec 2025).

References

  • Kubal et al., 10 Dec 2025.
