SkyGP-Dense: Streaming Gaussian Processes

Updated 7 August 2025

SkyGP-Dense is a streaming Gaussian Process framework that uses kernel-induced expert partitioning to manage strict memory and computational constraints.
It dynamically allocates streaming data to expert models and incrementally updates each expert's center, based on kernel similarity, so that the center stays representative of the data assigned to it.
The framework aggregates experts’ uncertainty in real time, enabling efficient online control and safety-critical applications.

SkyGP-Dense is a streaming, kernel-induced, progressively generated expert framework for Gaussian Processes (GPs). It targets online learning and control and is explicitly designed to maximize prediction accuracy under strict memory and computational constraints. As a component of the broader SkyGP architecture, SkyGP-Dense partitions streaming data adaptively across experts and uses a bounded-memory replacement mechanism that preserves the theoretical learning guarantees and uncertainty quantification associated with exact GPs, at a computational cost suitable for safety-critical real-time systems (Yang et al., 5 Aug 2025).

1. Architecture and Core Principles

The SkyGP framework addresses the computational and scalability bottlenecks of conventional Gaussian Processes, specifically the cubic computational and quadratic storage complexities associated with kernel matrix inversion when processing large or streaming datasets. SkyGP addresses these bottlenecks by:

Maintaining a dynamically bounded set of GP "experts," each associated with a representative (center) and an adaptive data allocation.
Assigning new streaming data to the nearest expert, with "nearness" defined through a kernel-induced distance $d_i = 1/\kappa(c_i, x)$, where $\kappa(\cdot, \cdot)$ denotes the kernel function and $c_i$ is the center of expert $i$.
Updating expert centers incrementally via:

$c_i^k = \frac{(k-1)\, c_i^{k-1}}{k} + \frac{x^k}{k}$

as new points $x^k$ are assigned to expert $i$, so that $c_i^k$ is the running mean of the $k$ points assigned so far.

A time-aware factor $\theta$ is integrated per expert to reflect usage history and inform adaptive re-weighting.
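
The assignment and center-update steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the squared-exponential kernel, the function names, and the small floor that guards against division by zero are all assumptions made for the example.

```python
import math

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel (an assumed choice for illustration)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq / lengthscale ** 2)

def nearest_expert(centers, x, kernel=rbf_kernel):
    """Return the index i minimizing d_i = 1 / kappa(c_i, x)."""
    # The 1e-300 floor is an assumption: it avoids dividing by a kernel
    # value that has underflowed to zero for very distant points.
    distances = [1.0 / max(kernel(c, x), 1e-300) for c in centers]
    return min(range(len(centers)), key=distances.__getitem__)

def update_center(center, x, k):
    """c^k = ((k - 1) * c^{k-1} + x^k) / k, with k counting x itself."""
    return [((k - 1) * ci + xi) / k for ci, xi in zip(center, x)]

# Two experts, each having seen one point so far.
centers = [[0.0, 0.0], [4.0, 4.0]]
counts = [1, 1]

x_new = [3.0, 5.0]
i = nearest_expert(centers, x_new)
counts[i] += 1
centers[i] = update_center(centers[i], x_new, counts[i])
```

Here the new point is assigned to the second expert, whose center moves to the mean of its two points, `[3.5, 4.5]`.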

2. SkyGP-Dense Data Replacement and Memory Bounding

The SkyGP-Dense variant introduces a targeted strategy for maximizing prediction accuracy via memory-efficient data replacement:

Each expert enforces a strict budget $\bar{N}$ on the number of stored samples.
Upon receiving a new sample $(x^k, y^k)$ while at capacity, the expert identifies the stored sample that is maximally distant from its center in the kernel-induced feature space.
This distant point is dropped and the new sample is ingested, so the expert's memory remains bounded. The center of the dropped data, $c_{nr}^{off}$, is also updated incrementally.

The decision to trigger replacement is determined by an event-based rule:

$\Delta(s) = \kappa(x^s, c_{nr}) - \kappa(x^s, c_{nr}^{off}) - \kappa(x^k, c_{nr}) + \kappa(x^k, c_{nr}^{off})$

Replacement occurs if $\Delta(s) < 0$, in which case the expert's dataset is updated as:

$D_{nr}(t_k) = D_{nr}(t_{k-1}) \cup \{(x^k, y^k)\} \setminus \{(x^{k_{off}}, y^{k_{off}})\}$

This mechanism ensures that the most representative (by kernel similarity) data are retained, directly prioritizing model accuracy.
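
A hedged sketch of this replacement rule follows. It evaluates $\Delta$ exactly as written above and swaps the samples when the result is negative. Several details are assumptions for illustration only: the squared-exponential kernel, the function names, the choice of the candidate $x^s$ as the stored sample with the lowest kernel similarity to the center, and passing the dropped-data center `center_off` in as a fixed argument rather than updating it.

```python
import math

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel (an assumed choice for illustration)."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq / lengthscale ** 2)

def replacement_delta(x_s, x_k, center, center_off, kernel=rbf_kernel):
    """Delta = k(x_s, c) - k(x_s, c_off) - k(x_k, c) + k(x_k, c_off)."""
    return (kernel(x_s, center) - kernel(x_s, center_off)
            - kernel(x_k, center) + kernel(x_k, center_off))

def maybe_replace(data, new_sample, center, center_off, kernel=rbf_kernel):
    """Apply the event rule to a full expert.

    data is a list of (x, y) pairs. Returns (data, dropped), where
    dropped is None when the rule does not fire.
    """
    x_k, _ = new_sample
    # Candidate for removal: lowest similarity to the center, i.e. the
    # stored sample that is maximally distant in the kernel-induced sense.
    s = min(range(len(data)), key=lambda j: kernel(data[j][0], center))
    if replacement_delta(data[s][0], x_k, center, center_off, kernel) < 0:
        dropped = data[s]
        return data[:s] + data[s + 1:] + [new_sample], dropped
    return data, None

center, center_off = [0.0], [5.0]
data = [([0.1], 1.0), ([0.2], 1.1), ([3.0], 2.0)]

# A new sample close to the center displaces the distant stored one.
data, dropped = maybe_replace(data, ([0.3], 1.2), center, center_off)

# A new sample far from the center leaves the stored data untouched.
data_after, dropped_far = maybe_replace(data, ([4.0], 0.0), center, center_off)
```

With these inputs the first call drops the sample at `[3.0]`, and the second call returns the data unchanged because $\Delta$ is positive.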

3. Distributed Inference and Uncertainty Aggregation

SkyGP employs distributed Gaussian Process (DGP) schemes for inference, combining the outputs of multiple experts. Aggregation strategies include Mixture-of-Experts (MoE), Product-of-Experts (PoE), and the Bayesian Committee Machine (BCM). For example, BCM-style aggregation combines the experts' predictive precisions as:

$\varpi_i(x) = \frac{1}{\sum_{j\in I_{agg}} w_j \sigma_j^{-2}(x) + \left(1 - \sum_{j\in I_{agg}}w_j\right)\sigma_*^{-2}}$

where $\sigma_j^2(x)$ is the predictive variance of expert $j$, $w_j$ is its weight, and $I_{agg}$ is the set of experts being aggregated. This keeps uncertainty estimates calibrated at the aggregated level, which is critical for downstream control and safety guarantees.
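
The aggregation formula can be evaluated directly, as in the sketch below. Two points are assumptions rather than statements from the source: $\sigma_*^2$ is read as the prior variance, as in the standard BCM, and the aggregated mean uses the usual precision-weighted BCM form, which the text above does not give. Function names are hypothetical.

```python
def aggregate_variance(variances, weights, prior_variance):
    """1 / (sum_j w_j / sigma_j^2 + (1 - sum_j w_j) / sigma_*^2)."""
    precision = sum(w / v for w, v in zip(weights, variances))
    precision += (1.0 - sum(weights)) / prior_variance
    return 1.0 / precision

def aggregate_mean(means, variances, weights, prior_variance):
    """Precision-weighted mean (standard BCM form, assumed here)."""
    agg_var = aggregate_variance(variances, weights, prior_variance)
    weighted = sum(w * m / v for w, m, v in zip(weights, means, variances))
    return agg_var * weighted

# Two experts with unit weights and a unit prior variance.
agg_var = aggregate_variance([0.5, 0.25], [1.0, 1.0], 1.0)
agg_mean = aggregate_mean([1.0, 2.0], [0.5, 0.25], [1.0, 1.0], 1.0)
```

In this example the combined precision is 2 + 4 - 1 = 5, so the aggregated variance is 0.2, lower than that of either expert alone.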

4. Experimental Comparisons and Performance Metrics

Empirical evaluation against methods such as Local GPs, LoG-GP, ISSGP, and SSGP across regression and control benchmarks provides key insights:

On the SARCOS dataset, SkyGP-Dense achieves an SMSE of 0.017 and an MSLL of -2.03, outperforming competing methods in both accuracy and log-loss.
Computational efficiency is demonstrated: ISSGP requires up to 18 s for prediction on SARCOS, whereas SkyGP variants achieve update times as low as 0.04 s and prediction times of 0.15 to 0.24 s.
In real-time control experiments, the framework yields tighter tracking errors and improved safety, reflecting the benefits of accurate online uncertainty estimation.
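
For reference, the two regression metrics quoted above can be computed as follows. The sketch uses the standard definitions: SMSE is the mean squared error divided by the variance of the test targets, and MSLL is the mean negative log predictive density minus that of a trivial Gaussian fitted to the training targets. The function names are hypothetical, and the exact evaluation protocol used in the paper is not stated here.

```python
import math

def smse(y_true, y_pred):
    """Standardized mean squared error: MSE / variance of the test targets."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_y = sum(y_true) / n
    var_y = sum((t - mean_y) ** 2 for t in y_true) / n
    return mse / var_y

def gaussian_nll(y, mean, var):
    """Negative log density of y under N(mean, var)."""
    return 0.5 * math.log(2.0 * math.pi * var) + (y - mean) ** 2 / (2.0 * var)

def msll(y_true, pred_mean, pred_var, train_mean, train_var):
    """Mean standardized log loss relative to a trivial Gaussian predictor."""
    n = len(y_true)
    model = sum(gaussian_nll(y, m, v)
                for y, m, v in zip(y_true, pred_mean, pred_var))
    trivial = sum(gaussian_nll(y, train_mean, train_var) for y in y_true)
    return (model - trivial) / n

y_test = [1.0, 2.0, 3.0]

# Always predicting the test mean gives an SMSE of exactly 1.
baseline_smse = smse(y_test, [2.0, 2.0, 2.0])

# A confident, accurate model scores below zero on MSLL.
good_msll = msll(y_test, [1.1, 1.9, 3.0], [0.05, 0.05, 0.05], 2.0, 1.0)
```

An SMSE well below 1 and a negative MSLL, as in the SARCOS figures above, both indicate improvement over the trivial predictor.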

SkyGP-Dense update complexity is $O(W \log(\mathcal{N}) + \mathcal{N}\bar{N}^3)$ (due to Cholesky factorization for bounded dataset size $\bar{N}$ per expert and number of experts $\mathcal{N}$), and prediction per step is $O(\mathcal{N}N(t_k)^2)$.
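
To show where the cubic and quadratic terms come from, the sketch below implements a single exact GP expert in plain Python. Fitting performs one Cholesky factorization of the kernel matrix (cubic in the number of stored samples), and each prediction performs one triangular solve (quadratic). This is a generic textbook GP, not the authors' code; the kernel, the noise level, and all names are assumptions.

```python
import math

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel (an assumed choice for illustration)."""
    sq = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-0.5 * sq / lengthscale ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A; cubic in the matrix size."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution for L y = b; quadratic in the matrix size."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][m] * y[m] for m in range(i))) / L[i][i])
    return y

def solve_lower_transposed(L, y):
    """Back substitution for L^T x = y; quadratic in the matrix size."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[m][i] * x[m] for m in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x

def fit_expert(X, y, noise_var):
    """Factorize K + noise_var * I once and cache alpha = (K + noise I)^-1 y."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y))
    return L, alpha

def predict_expert(X, L, alpha, x_star):
    """Posterior mean and latent variance at x_star using the cached factor."""
    k_star = [rbf(a, x_star) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_star, x_star) - sum(vi * vi for vi in v)
    return mean, var

X = [[0.0], [1.0], [2.0]]
y = [0.0, 1.0, 0.0]
L, alpha = fit_expert(X, y, noise_var=1e-6)
mean_near, var_near = predict_expert(X, L, alpha, [1.0])
mean_far, var_far = predict_expert(X, L, alpha, [10.0])
```

Near a training input the posterior mean tracks the target and the variance collapses, while far from the data the prediction reverts to the prior (mean close to 0, variance close to 1). Capping the number of stored samples per expert is what keeps both costs bounded.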

5. Application Domains and Theoretical Guarantees

SkyGP-Dense is positioned for use in safety-critical online learning, adaptive control of dynamic systems (autonomous underwater vehicles, medical devices, aerial robotics), and other real-time environments where computational resources are finite and uncertainty-aware prediction is necessary.

Distinctive characteristics include:

Bounded memory and computational cost, making deployment feasible on embedded and real-time hardware.
Theoretical error bounds derived from its inheritance of exact GP properties and the expert aggregation scheme, suitable for control-Lyapunov and provably-safe policy integration.
Adaptive, representative data selection via kernel similarity, upholding accuracy in non-stationary and data-rich, streaming contexts.

SkyGP-Dense contrasts sharply with traditional GP streaming approaches that either:

Discard the most recent data under memory constraints (potentially losing critical new information).
Depend on simple data partitioning without representativeness or adaptive weighting.
Forego uncertainty calibration in ensemble aggregation or incur unbounded memory growth.

By design, SkyGP-Dense provides a principled, kernel-driven mechanism for both representativeness (data closest to the expert center are retained) and adaptivity (triggered expert/point replacement) while guaranteeing bounded complexity and maintaining high predictive fidelity.

6. Limitations and Practical Considerations

While SkyGP-Dense substantially alleviates the scalability issues typical of GPs, its reliance on full Cholesky updates during replacement introduces an $O(\bar{N}^3)$ cost per event, so $\bar{N}$ must be chosen carefully for the target hardware. The replacement strategy triggers only on critical events to amortize this cost. The framework presumes that a kernel can be specified and efficiently evaluated for the application domain, since the kernel function's structure and parameterization are central to its performance.

In conclusion, SkyGP-Dense presents a rigorously validated, theoretically grounded framework for streaming GP inference, balancing the tradeoffs between memory, computation, predictive accuracy, and uncertainty quantification, and is particularly suitable for real-time learning and control under resource constraints (Yang et al., 5 Aug 2025).

References

Yang et al. (2025). Streaming Generated Gaussian Process Experts for Online Learning and Control.