SGD-Based OCSVM Solver
- The paper introduces SONAR, an SGD-based OCSVM solver that leverages strongly convex regularization and Random Fourier Features to achieve efficient streaming anomaly detection.
- The method reduces computational complexity while providing last-iterate guarantees on Type I/II error rates via a single-pass SGD algorithm.
- Empirical evaluations demonstrate robust adaptivity and performance on synthetic and real-world data under both benign and adversarial non-stationary conditions.
Stochastic Gradient Descent (SGD)-based One-Class Support Vector Machine (OCSVM) solvers provide an efficient approach to streaming outlier detection, overcoming key limitations of traditional kernel methods in both computational tractability and statistical guarantees. SONAR, introduced by Suk et al. (11 Dec 2025), leverages strongly convex regularization and Random Fourier Features (RFFs) to deliver last-iterate statistical guarantees on Type I/II error rates for single-pass, non-stationary data streams, with extensions enabling adaptive tracking under adversarial non-stationarity.
1. Classical OCSVM Formulation and Limitations
Standard kernel-based OCSVM methods (Schölkopf et al. 1999, 2001) solve an RKHS maximum-margin problem. The primal soft-margin OCSVM is formulated as
$\min_{w\in\mathcal H,\;\rho\in\mathbb R,\;\xi\ge0}\;\frac12\|w\|_{\mathcal H}^2\;-\;\rho\;+\;\frac1{\lambda T}\sum_{t=1}^T\xi_t \quad\text{s.t.}\quad\langle w,\varphi(X_t)\rangle\ge\rho-\xi_t,\;\xi_t\ge0,$
where $\lambda\in(0,1]$ bounds the permitted fraction of outliers (Type I error). Alternatively, this can be written as the unconstrained penalty problem
$\min_{w,\rho}\; \frac12\|w\|_{\mathcal H}^2 - \rho + \frac1{\lambda T}\sum_{t=1}^T \big(\rho - \langle w, \varphi(X_t)\rangle\big)_+,$
which relies on full access to the $T\times T$ Gram matrix $K_{st}=\langle\varphi(X_s),\varphi(X_t)\rangle$. Such requirements are computationally prohibitive for streaming or one-pass settings: memory scales as $O(T^2)$ and QP solve time as $O(T^3)$. Furthermore, the objective lacks strong convexity, resulting in slow SGD convergence and no reliable last-iterate guarantees on outlier (decision) error.
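For concreteness, a minimal baseline sketch using scikit-learn's `OneClassSVM` (a batch solver over pairwise kernel evaluations) illustrates the regime SONAR is designed to replace; the synthetic data, `gamma`, and `nu` values below are arbitrary illustrative choices, not settings from the paper:

```python
# Baseline: batch kernel OCSVM via a QP over pairwise kernel evaluations.
# Illustrates the O(T^2)-memory / O(T^3)-time regime SONAR avoids.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))          # T = 2000 nominal samples

# nu plays the role of lambda above: an upper bound on the outlier fraction.
ocsvm = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)

labels = ocsvm.predict(X)               # +1 = inlier, -1 = flagged outlier
print("flagged fraction:", np.mean(labels == -1))   # roughly bounded by nu
```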
2. Strongly Convex Reformulation with Random Fourier Features
SONAR addresses these limitations by approximating the RKHS kernel with RFFs, replacing the feature map $\varphi$ by a finite-dimensional map $\phi:\mathbb R^d\to\mathbb R^D$. The infinite-dimensional RKHS norm is replaced with the Euclidean norm on $\mathbb R^D$, and the objective is made strongly convex by augmenting it with a quadratic term in $\rho$:
$\min_{w\in\mathbb R^D,\,\rho\in\mathbb R}\; \frac12\|w\|_2^2 + \frac12\rho^2 - \rho + \frac1{\lambda T}\sum_{t=1}^T \big(\rho - \langle w, \phi(X_t)\rangle\big)_+.$
Here, $\phi$ is normalized such that $\phi(X)$ is supported on the unit sphere or is RFF-normalized, with $\|\phi(x)\|_2\le 1$. The strong convexity (the objective is 1-strongly convex in $(w,\rho)$) underpins rapid last-iterate convergence and uniform high-probability error control within streaming regimes (Suk et al., 11 Dec 2025).
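As a concrete illustration, here is a minimal RFF sketch for the RBF kernel using paired cosine–sine features, which are exactly unit-norm and thus satisfy the unit-sphere normalization above; `D`, `sigma`, and the test points are illustrative assumptions:

```python
# Minimal RFF sketch for the RBF kernel k(x,y) = exp(-||x-y||^2 / (2 sigma^2)).
# The cos/sin pairing gives ||phi(x)||_2 = 1 exactly.
import numpy as np

def make_rff(dim: int, D: int, sigma: float, seed: int = 0):
    rng = np.random.default_rng(seed)
    # D/2 frequencies; each contributes a (cos, sin) pair -> D features total.
    W = rng.normal(scale=1.0 / sigma, size=(D // 2, dim))

    def phi(x: np.ndarray) -> np.ndarray:
        z = W @ x
        # 1/sqrt(D/2) scaling makes E[phi(x)^T phi(y)] = k(x,y) and ||phi(x)|| = 1.
        return np.concatenate([np.cos(z), np.sin(z)]) / np.sqrt(D // 2)

    return phi

phi = make_rff(dim=5, D=512, sigma=1.0)
x, y = np.ones(5), np.zeros(5)
approx = phi(x) @ phi(y)
exact = np.exp(-np.dot(x - y, x - y) / 2.0)   # sigma = 1
print(f"kernel approx {approx:.4f} vs exact {exact:.4f}; "
      f"||phi(x)|| = {np.linalg.norm(phi(x)):.4f}")
```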
3. Single-Pass SGD Algorithm and Update Rules
For the streaming context, SONAR employs the following update rules for each new sample $X_t$:
- Compute the instantaneous loss: $\ell_t(w,\rho)=\frac12\|w\|_2^2+\frac12\rho^2-\rho+\frac1{\lambda}\big(\rho-\langle w,\phi(X_t)\rangle\big)_+.$
- Unbiased subgradient estimates: writing $\mathbb{1}_t=\mathbb{1}\{\langle w_t,\phi(X_t)\rangle\le\rho_t\}$, $g_t^w=w_t-\frac{\mathbb{1}_t}{\lambda}\,\phi(X_t)$ and $g_t^\rho=\rho_t-1+\frac{\mathbb{1}_t}{\lambda}$.
- Diminishing step size: $\eta_t=1/t$, matched to the 1-strong convexity of the objective.
- Closed-form one-pass SGD updates: $w_{t+1}=(1-\eta_t)\,w_t+\frac{\eta_t\mathbb{1}_t}{\lambda}\,\phi(X_t)$ and $\rho_{t+1}=(1-\eta_t)\,\rho_t+\eta_t\big(1-\frac{\mathbb{1}_t}{\lambda}\big)$.
By induction, $\|w_t\|_2\le 1/\lambda$ and $|\rho_t|\le 1/\lambda$ at every step, obviating the need for explicit projection. This facilitates streaming operation with $O(D)$ time per update, in contrast to the $O(T^3)$ training time and $O(T^2)$ memory cost of standard kernel OCSVM solvers.
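A minimal sketch of these updates, assuming the reconstructed objective above and any feature map with $\|\phi(x)\|_2\le 1$; the normalized linear map and the simulated Gaussian stream are illustrative stand-ins, not the paper's setup:

```python
# One-pass SGD sketch for the strongly convex RFF-OCSVM objective above.
import numpy as np

def sonar_stream(stream, phi, lam):
    """Yield (w_t, rho_t) after each sample; O(D) work per update."""
    w, rho = None, 0.0
    for t, x in enumerate(stream, start=1):
        z = phi(x)
        if w is None:
            w = np.zeros_like(z)                       # w_1 = 0, so ||w_t|| <= 1/lam
        eta = 1.0 / t                                  # step size for 1-strong convexity
        ind = float(w @ z <= rho)                      # hinge subgradient indicator
        w = (1.0 - eta) * w + (eta * ind / lam) * z
        rho = (1.0 - eta) * rho + eta * (1.0 - ind / lam)
        yield w, rho

rng = np.random.default_rng(1)
phi = lambda x: x / max(1.0, np.linalg.norm(x))        # any map with ||phi(x)|| <= 1
stream = (rng.normal(size=8) + 2.0 for _ in range(10_000))
for w, rho in sonar_stream(stream, phi, lam=0.1):
    pass                                               # single pass; constant memory
print("final rho:", round(rho, 4), "||w_T||:", round(float(np.linalg.norm(w)), 4))
```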
4. Theoretical Performance and Lifelong Guarantees
SONAR provides explicit, high-probability finite-sample guarantees for both Type I (false positive) and Type II (false negative) errors:
- Convergence of the last iterate: With probability at least $1-\delta$, after $T$ samples the last iterate $(w_T,\rho_T)$ lies within squared distance $\tilde O(1/T)$ of the regularized minimizer $(w^\star,\rho^\star)$, the fast rate afforded by strong convexity.
- Type I error (false positive) control: For the minimizer $(w^\star,\rho^\star)$, $\mathbb{P}\big(\langle w^\star,\phi(X)\rangle<\rho^\star\big)\le\lambda$, and the same bound transfers to the SGD iterate $(w_T,\rho_T)$ given a sufficiently large sample size and a vanishing shrinkage of the acceptance threshold (an illustrative check appears after this list).
- Large-margin (Type II) guarantees: The learned margin $\rho_T$ is lower-bounded in terms of the support-function margin of the nominal distribution, so in-distribution points are accepted with quantifiable slack.
- Lifelong (transfer) learning: If the data distribution switches at a changepoint $\tau$, then for all $t>\tau$ the Type I guarantee continues to hold on the new phase, plus a telescopic lower bound on the margin. This enables the algorithm to inherit statistical performance from previous phases in the stream, rapidly adapting to benign distribution shifts (Suk et al., 11 Dec 2025).
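As an illustrative sanity check (not an experiment from the paper), one can simulate a nominal stream, run the single-pass updates above, and verify that the fraction of fresh nominal points flagged stays near the prescribed $\lambda$; the data, `lam`, and feature map are hypothetical choices:

```python
# Hypothetical Monte Carlo check of the Type I property: after a single pass,
# the fraction of fresh nominal points with score below rho_T should be close
# to (and not substantially exceed) lambda.
import numpy as np

rng = np.random.default_rng(2)
lam, d = 0.1, 8
phi = lambda x: x / max(1.0, np.linalg.norm(x))   # any map with ||phi(x)|| <= 1

w, rho = np.zeros(d), 0.0
for t in range(1, 50_001):                        # single pass over the stream
    z = phi(rng.normal(size=d) + 2.0)             # shifted-Gaussian nominal data
    eta = 1.0 / t
    ind = float(w @ z <= rho)
    w = (1 - eta) * w + (eta * ind / lam) * z
    rho = (1 - eta) * rho + eta * (1 - ind / lam)

fresh = np.array([phi(rng.normal(size=d) + 2.0) for _ in range(10_000)])
type1 = np.mean(fresh @ w < rho)
print(f"empirical Type I error {type1:.3f} vs target lambda = {lam}")
```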
5. Adaptation to Adversarial Non-Stationarity: SONARC
To handle fully adversarial, possibly abrupt non-stationarity, SONAR is embedded within a classical ensemble architecture, SONARC (“SONAR with Changepoint detection”; editor’s term), which works as follows (a schematic sketch follows this list):
- Maintains a pool of base learners, each resetting at dyadic epochs of geometrically increasing length, so that only logarithmically many are active at once.
- At each time $t$, the current iterate is compared against each base learner trained since its last reset point; if the disagreement exceeds a calibrated threshold for any base, a changepoint is detected and all learners are restarted on the remaining stream.
- Safety and adaptivity theorems guarantee that changepoint detection is triggered only when the true underlying minimizer shifts, and that within stationary phases of sufficient length SONARC matches the Type I/II guarantees of an oracle with phase knowledge.
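The following schematic sketch shows one way such an ensemble can be organized; the dyadic spawn schedule, the norm-based disagreement statistic, and the $c/\sqrt{t}$ threshold are editorial assumptions standing in for the paper's calibrated test:

```python
# Schematic SONARC-style ensemble; the spawn schedule, disagreement statistic,
# and threshold below are editorial assumptions, not the paper's exact rule.
import numpy as np

class SgdLearner:
    """One-pass SGD learner from Section 3, restartable at any time."""
    def __init__(self, dim, lam):
        self.w, self.rho, self.t, self.lam = np.zeros(dim), 0.0, 0, lam
    def update(self, z):
        self.t += 1
        eta = 1.0 / self.t
        ind = float(self.w @ z <= self.rho)
        self.w = (1 - eta) * self.w + (eta * ind / self.lam) * z
        self.rho = (1 - eta) * self.rho + eta * (1 - ind / self.lam)

class Sonarc:
    def __init__(self, dim, lam, c=1.0):
        self.dim, self.lam, self.c = dim, lam, c
        self.restart()
    def restart(self):
        self.master = SgdLearner(self.dim, self.lam)  # sees all data since restart
        self.bases = []                               # suffix learners on recent data
        self.t = 0
    def step(self, z):
        self.t += 1
        if (self.t & (self.t - 1)) == 0:              # spawn at dyadic times 1,2,4,...
            self.bases.append(SgdLearner(self.dim, self.lam))
        self.master.update(z)
        for b in self.bases:
            b.update(z)
        for b in self.bases:                          # disagreement test vs. each base
            if np.linalg.norm(self.master.w - b.w) > self.c / np.sqrt(b.t):
                self.restart()                        # changepoint: drop all history
                return True
        return False

det = Sonarc(dim=8, lam=0.1)
# Feed det.step(phi(x)) per streamed point; True signals a detected changepoint.
```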
6. Empirical Validation and Computational Efficiency
Empirical evaluation on synthetic and real-world data (including the SKAB water-loop and Aposemat Malware-Capture datasets) demonstrates:
- Computational complexity: After mapping to RFFs, each update and evaluation is $O(D)$ per sample, contrasting with the cubic-time, quadratic-memory complexity of standard OCSVM QP solvers.
- Type I error tracking: Cumulative online Type I error closely tracks the user-specified level $\lambda$, including on the SKAB stream.
- Type II error rates: SONAR attains a low final online Type II error on challenging cases, competitive with deep-learning baselines.
- Robust adaptivity: SONARC matches oracle-level false positive and margin guarantees in synthetic multi-phase streams, outperforming multivariate changepoint detection techniques that can falsely trigger on non-critical distributional shifts.
These empirical findings confirm that SONAR and its ensemble extension provide streaming, computationally efficient, and adaptively reliable anomaly detection, establishing the first guarantees for last-iterate errors, margin growth, and lifelong transfer under both benign and adversarial regime changes (Suk et al., 11 Dec 2025).