
SOOTT: Smoothed Online Optimization for Target Tracking

Updated 14 September 2025
  • SOOTT is a unified framework that integrates trajectory tracking, adversarial perturbation management, and smooth decision changes for real-time control.
  • The BEST algorithm achieves robust tracking by emulating an ideal informed strategy without requiring direct adversarial inputs.
  • The CoRT algorithm enhances performance by incorporating calibrated ML predictions, balancing consistency with robustness in dynamic environments.

Smoothed Online Optimization for Target Tracking (SOOTT) is a principled framework that unifies online trajectory tracking, robustness against adversarial perturbations, and control over decision smoothness via switching costs. SOOTT generalizes classical online convex optimization and control approaches by explicitly formulating the agent’s per-stage cost as a combination of tracking error, adversarial response, and penalization of abrupt decision changes. This formulation is well suited for applications such as real-time workload scheduling, adaptive resource allocation, and multi-target tracking, where simultaneous consideration of long-term performance, adversarial environments, and real-time adaptability is critical.

1. Core Problem Formulation and Cost Components

The SOOTT framework requires the agent to select actions $x_t \in \mathcal{X}$ at each time $t$, incurring three distinct types of costs:

  • Tracking Cost: Quantifies the deviation of a (typically windowed) average of the agent’s actions from a prescribed dynamic trajectory target $\tau_t$.

$$\left\| \frac{x_t + h_t}{w+1} - \tau_t \right\|^2$$

where $h_t$ aggregates the previous $w$ actions.

  • Adversarial Perturbation Cost: Penalizes discrepancies between the action $x_t$ and an adversary’s dynamic (and possibly hostile) target $u_t$, captured by a convex function $f_t$.

$$\lambda_1 f_t(x_t - u_t)$$

  • Switching Cost: Penalizes abrupt changes in decisions to encourage smooth temporal evolution.

$$\lambda_2 \| x_t - x_{t-1} \|^2$$

The cumulative cost per round is thus:

$$\text{Cost}_t(x_t, h_t) = \left\| \frac{x_t + h_t}{w+1} - \tau_t \right\|^2 + \lambda_1 f_t(x_t - u_t) + \lambda_2 \| x_t - x_{t-1} \|^2$$

This formulation models, for instance, tracking a fluctuating service-level target ($\tau_t$) while meeting unpredictable real-time demand spikes ($u_t$, e.g., inelastic jobs) and minimizing disruptive changes in resource allocation.
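
As a concrete reference, the per-round cost can be written directly in code. The following is a minimal NumPy sketch of the formula above; the quadratic choice of $f_t$ and all numerical values are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def soott_cost(x_t, x_prev, h_t, tau_t, u_t, f_t, w, lam1, lam2):
    """Per-round SOOTT cost: tracking + adversarial + switching terms."""
    tracking = np.sum(((x_t + h_t) / (w + 1) - tau_t) ** 2)  # windowed tracking error
    adversarial = lam1 * f_t(x_t - u_t)                      # penalty w.r.t. adversarial target u_t
    switching = lam2 * np.sum((x_t - x_prev) ** 2)           # smoothness (switching) penalty
    return tracking + adversarial + switching

# Illustrative call with a quadratic adversarial penalty f_t(v) = ||v||^2.
f_quad = lambda v: float(np.sum(v ** 2))
x_t, x_prev = np.array([2.0, 1.0]), np.array([1.5, 1.2])
h_t = np.array([5.0, 4.0])                                   # sum of the previous w = 4 actions
tau_t, u_t = np.array([1.4, 1.0]), np.array([2.2, 0.8])
print(soott_cost(x_t, x_prev, h_t, tau_t, u_t, f_quad, w=4, lam1=0.5, lam2=0.2))
```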

2. Robust Algorithmic Design: The BEST Algorithm

BEST (Backward Evaluation for Sequential Targeting) is a robust online algorithm for SOOTT that does not rely on knowledge of the adversarial target $u_t$ at decision time. Instead, BEST emulates the decisions of an idealized Informed Greedy Algorithm (IGA), which does know $u_t$, by referencing IGA’s prior actions to drive its own update rule:

$$x_t \leftarrow \arg\min_{x} \left\| \frac{x + \bar{h}_t}{w+1} - \tau_t \right\|^2 + \lambda_2 \| x - x_{t-1}^{\text{(IGA)}} \|^2$$

where $\bar{h}_t$ accumulates IGA’s past decisions. The adversarial term is omitted since $u_t$ is not observable, making BEST robust to worst-case perturbations by design.
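
Because both terms in BEST’s objective are quadratic (for the Euclidean norm), the unconstrained minimizer has a closed form. The sketch below is an illustrative implementation under those assumptions; it ignores any feasible-set constraint $x \in \mathcal{X}$, which would require an additional projection or numerical solve.

```python
import numpy as np

def best_update(h_bar_t, tau_t, x_prev_iga, w, lam2):
    """One BEST step:
    argmin_x ||(x + h_bar_t)/(w+1) - tau_t||^2 + lam2 * ||x - x_prev_iga||^2.
    Unconstrained closed form from the first-order condition (sketch only)."""
    a = 1.0 / (w + 1) ** 2  # curvature of the tracking term
    # Stationarity: a*(x + h_bar_t - (w+1)*tau_t) + lam2*(x - x_prev_iga) = 0
    return (a * ((w + 1) * tau_t - h_bar_t) + lam2 * x_prev_iga) / (a + lam2)
```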

Theoretical guarantees are expressed using a degradation factor:

$$\mathrm{DF}(\text{BEST}, \text{IGA}) \le 1 + \frac{\ell}{m} \cdot \frac{\eta^2 + 2 \lambda_1 \ell (1+\lambda_2)}{\eta (\eta - m\lambda_1)}$$

with

$$\eta = \frac{2}{(w+1)^2} + m\lambda_1 + 2\lambda_2$$

where $m$ and $\ell$ are problem-dependent parameters (e.g., the action dimension and window size). This performance bound ensures that even in adversarial environments, BEST incurs at most a constant-factor higher cost than the unattainable informed strategy.
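
For intuition, the bound is straightforward to evaluate once $m$, $\ell$, and the cost weights are fixed. The snippet below simply transcribes the formula; the numerical values are placeholders chosen only to show the computation.

```python
def df_bound_best(w, lam1, lam2, m, ell):
    """Degradation-factor bound for BEST; valid when eta - m*lam1 > 0."""
    eta = 2.0 / (w + 1) ** 2 + m * lam1 + 2 * lam2
    assert eta - m * lam1 > 0, "bound requires eta - m*lam1 > 0"
    return 1.0 + (ell / m) * (eta ** 2 + 2 * lam1 * ell * (1 + lam2)) / (eta * (eta - m * lam1))

# Placeholder parameter values, for illustration only.
print(df_bound_best(w=4, lam1=0.5, lam2=0.2, m=1.0, ell=1.0))
```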

3. Learning-Augmented Optimization: The CoRT Algorithm

CoRT (Consistent and Robust Tracking) extends BEST by incorporating untrusted predictions of the adversarial target $u_t$, typically provided by black-box machine learning models (such as LSTMs). Rather than trusting predictions blindly, CoRT implements a calibration mechanism:

  • Let $\hat{u}_t$ denote the predicted adversarial target.
  • Define a calibrated target $\tilde{u}_t$ through a projection:

$$\| \tilde{u}_t - x_t \| \le \theta D_t$$

where $D_t$ is an upper bound on the current estimation error, and $\theta > 0$ controls how aggressively the forecast is exploited.

CoRT then updates according to:

$$\tilde{x}_t \leftarrow \arg\min_{x} \left\| \frac{x + \bar{h}_t}{w+1} - \tau_t \right\|^2 + \lambda_1 f_t(x - \tilde{u}_t) + \lambda_2 \| x - x_{t-1}^{\text{(IGA)}} \|^2$$

By choosing $\theta \approx 0$, CoRT reduces to robust BEST; for large $\theta$, it increasingly leverages the black-box predictions.
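
A sketch of the calibration and update steps is given below. It treats $f_t$ as a generic convex function and solves the per-round problem numerically with SciPy; the projection anchor (written $x_t$ above) is passed in explicitly, and the whole snippet is an illustrative reading of the description rather than the authors’ implementation. For quadratic $f_t$, a closed form analogous to the BEST update would replace the numerical solve.

```python
import numpy as np
from scipy.optimize import minimize

def calibrate_target(u_hat_t, x_anchor, theta, D_t):
    """Project the prediction u_hat_t onto the ball of radius theta*D_t around x_anchor."""
    delta = u_hat_t - x_anchor
    radius = theta * D_t
    norm = np.linalg.norm(delta)
    if norm <= radius:
        return u_hat_t  # prediction already lies inside the trust region
    return x_anchor + delta * (radius / norm)

def cort_update(h_bar_t, tau_t, x_prev_iga, u_tilde_t, f_t, w, lam1, lam2):
    """One CoRT step: minimize the calibrated per-round objective numerically."""
    def objective(x):
        tracking = np.sum(((x + h_bar_t) / (w + 1) - tau_t) ** 2)
        return tracking + lam1 * f_t(x - u_tilde_t) + lam2 * np.sum((x - x_prev_iga) ** 2)
    return minimize(objective, x0=np.asarray(x_prev_iga, dtype=float), method="L-BFGS-B").x
```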

Consistency–Robustness Tradeoff: Theoretical analysis shows that CoRT strictly outperforms BEST when predictions are accurate (“consistency”) and gracefully degrades to BEST’s robust guarantee for inaccurate or adversarial predictions (“robustness”):

$$\mathrm{DF}(\text{CoRT}, \text{IGA}) \le \mathrm{DF}(\text{BEST}, \text{IGA}) \cdot \left(1 + O(\theta^2)\right)$$

$$\mathcal{C} \le \psi(\theta) + (1-\psi(\theta))\,\mathrm{DF}(\text{BEST},\text{IGA}) + \frac{2\lambda_1\lambda_2\ell^2}{m \eta (\eta - m\lambda_1)} \cdot \frac{\theta^2}{1+\theta^2}$$

where $\psi(\theta)$ increases from 0 to 1 as $\theta \to \infty$.

4. Application: Workload Scheduling under Joint SOOTT Objectives

The SOOTT framework is instantiated in a case study of dynamic resource allocation for mixed elastic/inelastic workloads modeled after real-world AI cluster operations. Here:

  • The tracking cost keeps long-term allocated resources aligned with SLA requirements ($\tau_t$).
  • The adversarial cost penalizes misallocation relative to actual fluctuating inelastic demand ($u_t$).
  • The switching cost limits frequent, abrupt reallocations.

Experimentation with Google Cluster Data demonstrates that:

  • Both BEST and CoRT can maintain trajectory tracking (via a windowed average) while avoiding excessive adaptation (switching).
  • CoRT exploits ML-based demand forecasts for inelastic workloads, leading to improved overall cost when prediction is accurate, yet remains robust to mispredictions due to the calibration/projection regime.
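
To show how the pieces fit together, the toy scheduling rollout below applies the closed-form tracking/switching update from Section 2 to synthetic data. It is purely illustrative: a sinusoidal trajectory stands in for the cluster traces, and the algorithm’s own running history is used in place of the IGA quantities $\bar{h}_t$ and $x_{t-1}^{\text{(IGA)}}$, which the full BEST algorithm maintains by replaying the informed strategy.

```python
import numpy as np

# Toy scheduling rollout (illustrative only; synthetic data stand in for cluster traces).
rng = np.random.default_rng(0)
T, w, lam2 = 60, 4, 0.2
a = 1.0 / (w + 1) ** 2                                   # curvature of the tracking term
tau = 1.0 + 0.2 * np.sin(np.linspace(0, 4 * np.pi, T)) + 0.02 * rng.standard_normal(T)
history = [0.0] * w                                      # previous w actions
x_prev, actions = 0.0, []
for t in range(T):
    h_bar = sum(history[-w:])                            # stand-in for the IGA history term
    # Closed-form tracking + switching update (see the BEST sketch in Section 2)
    x_t = (a * ((w + 1) * tau[t] - h_bar) + lam2 * x_prev) / (a + lam2)
    actions.append(x_t)
    history.append(x_t)
    x_prev = x_t
print(f"mean action {np.mean(actions):.3f} vs mean target {np.mean(tau):.3f}")
```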

Parameter variation studies elucidate the core tradeoffs:

| Parameter | Effect |
| --- | --- |
| $\lambda_1$ | Up-weighting the adversarial cost prioritizes accurate response to the inelastic workload, at possible expense of long-term tracking or smoothness |
| $\lambda_2$ | Increases the preference for smooth action sequences; deters rapid response to unexpected changes |
| $w$ | Sets the averaging window, trading immediate responsiveness for stability in trajectory tracking |
| $\theta$ | Controls confidence in predictive input in the learning-augmented setting; larger $\theta$ trusts forecasts more |

The system’s total cost function, central to SOOTT, remains:

$$\text{Cost}_t(x_t, h_t) = \left\| \frac{x_t + h_t}{w+1} - \tau_t \right\|^2 + \lambda_1 f_t(x_t - u_t) + \lambda_2 \| x_t - x_{t-1} \|^2$$

5. Theoretical Performance Guarantees and Implications

SOOTT algorithms are evaluated via worst-case competitive analysis. The primary guarantee is the degradation factor (DF) for BEST and CoRT relative to IGA:

$$\mathrm{DF}(\text{BEST}, \text{IGA}) \le 1 + \frac{\ell}{m} \cdot \frac{\eta^2 + 2\lambda_1 \ell (1+\lambda_2)}{\eta (\eta - m\lambda_1)}$$

This guarantee is conservative and remains valid even as the problem parameters (e.g., cost weights, window size) vary, provided the cost parameters are well-conditioned ($\eta - m\lambda_1 > 0$). When predictions are perfect, the learning-augmented CoRT achieves near-optimal consistency, while for arbitrary (even adversarial) predictions, its cost remains within a quantifiable multiple of the robust baseline.

6. Connections with Prior Work and Extensions

SOOTT generalizes and unifies existing online optimization and tracking methodologies:

  • It extends smoothed online learning frameworks (Zhang et al., 2021), augmenting windowed tracking with adversarial and switching penalties under quadratic and polyhedral cost structures.
  • The combination of provable guarantees (via calibration to a robust baseline) and the ability to exploit predictive ML models aligns with recent theory for learning-augmented online optimization.
  • SOOTT’s structure—explicit tracking, adversarial, and smoothing terms—makes it compatible with the design of resource allocation schemes, multi-target tracking controllers, and real-time scheduling frameworks.

Further research directions include incorporating adaptive recalibration of trust in predictive models, extending analysis to partial-information settings, and exploring distributed/decentralized SOOTT variants for multi-agent or multi-region tracking.

7. Summary Table: Comparative Properties of SOOTT, BEST, CoRT, and IGA

| Algorithm | Requires $u_t$? | Uses black-box prediction | Robustness guarantee | Consistency (when predictions are accurate) |
| --- | --- | --- | --- | --- |
| IGA | Yes | No | Ideal benchmark | Attains lowest possible cost |
| BEST | No | No | DF ≤ explicit bound | Matches robust baseline |
| CoRT | No | Yes | DF ≤ BEST bound × (1 + O(θ²)) | Approaches IGA as θ increases; remains robust |

8. Conclusion

Smoothed Online Optimization for Target Tracking provides a rigorous, versatile foundation for online decision-making under uncertainty, balancing trajectory tracking, adversarial robustness, and control over decision smoothness. The theoretical and empirical results for BEST and CoRT illustrate both the necessity of robust smoothing and the benefit of integrating (projected) predictions. With its ability to manage nonstationary targets, adversarial disturbances, and dynamically evolving constraints, the SOOTT framework significantly broadens the toolkit for constructing high-reliability, low-regret online tracking systems in demanding real-world applications (Zeynali et al., 7 Sep 2025).
