
Dynamic VM Type Selection (DVTS)

Updated 8 February 2026
  • DVTS is an adaptive approach that dynamically selects virtual machine types to efficiently match resource demands with workload fluctuations.
  • It leverages real-time telemetry, contextual bandit algorithms, and anomaly detection to make safe, cost-effective resizing decisions.
  • Empirical results demonstrate up to 35% vCPU and 60% RAM reductions while maintaining strict performance SLOs under variable workloads.

Dynamic VM Type Selection (DVTS) refers to the online, adaptive selection or adjustment of virtual machine (VM) types—or resource sizes—in response to time-varying cloud workload demands, performance requirements, and cost constraints. DVTS systematically optimizes VM sizing decisions (e.g., vCPU and RAM allocations) to achieve efficient resource usage, performance stability, and cost-effectiveness. The approach leverages real-time telemetry data, machine learning, and safe online learning paradigms to dynamically match VM capacity to workload and avoid both overprovisioning and underprovisioning. DVTS contrasts with static or preplanned allocation by integrating prediction, anomaly detection, and risk-aware actuation.

1. Problem Formulation and Optimization Objectives

DVTS is defined over a set of candidate VM types, each with associated resource capacities (CPU, memory) and operating costs. At each decision epoch (e.g., every 5 minutes), for each VM v in the active set V, an action a_{v,t} ∈ A is chosen, where A is a finite set of possible resource resizes (e.g., add/remove vCPU, add/remove RAM, or no-change). In “ADARES: Adaptive Resource Management for Virtual Machines” (Cano et al., 2018), the objective is to minimize the cumulative allocated resources (summed vCPU and RAM) over time while ensuring per-VM performance SLOs (avoiding CPU saturation, out-of-memory events, and swap thrashing). The reward r_{v,t} for an action at time t captures the trade-off between resource savings and performance penalties:

r_{v,t} = f(\mathrm{ResourceSaving}_{v,t},\ \mathrm{PerformancePenalty}_{v,t})

In the application-server autoscaling setting (Grozev et al., 2016), the aim when scaling out is to select the VM type t* that supports the largest concurrent user count N_t at minimum cost per user, subject to the predicted resource loads not exceeding t’s normalized capacities:

t^* = \arg\min_{t \in T} \frac{c_t}{N_t}, \quad \text{subject to } \hat{p}^{\mathrm{CPU}}(N_t) \leq \mathrm{cap}_t^{\mathrm{CPU}},\ \hat{p}^{\mathrm{RAM}}(N_t) \leq \mathrm{cap}_t^{\mathrm{RAM}}
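As a concrete illustration, the selection rule above can be sketched as a greedy loop over the type catalogue. The type names, prices, and linear load predictors below are illustrative assumptions, not values from the paper:

```python
# Sketch of the scale-out type-selection rule (Grozev et al., 2016).
# VM types, costs, and the load-prediction functions are hypothetical.

def select_vm_type(vm_types, predict_cpu, predict_ram):
    """Pick the type t* minimizing cost per supported user c_t / N_t.

    For each type, N_t is the largest user count whose predicted CPU and
    RAM loads stay within the type's normalized capacities.
    """
    best_type, best_cost_per_user = None, float("inf")
    for t in vm_types:
        # Largest feasible concurrent-user count for this type.
        n = 0
        while (predict_cpu(n + 1) <= t["cap_cpu"]
               and predict_ram(n + 1) <= t["cap_ram"]):
            n += 1
        if n == 0:
            continue  # type cannot host even one user
        cost_per_user = t["cost"] / n
        if cost_per_user < best_cost_per_user:
            best_type, best_cost_per_user = t, cost_per_user
    return best_type

# Hypothetical catalogue; capacities are in normalized compute units.
types = [
    {"name": "small", "cost": 0.05, "cap_cpu": 1.0, "cap_ram": 1.0},
    {"name": "large", "cost": 0.25, "cap_cpu": 4.0, "cap_ram": 4.0},
]
# Toy linear load predictors (in practice, an online-trained ANN).
chosen = select_vm_type(types, predict_cpu=lambda n: 0.01 * n,
                        predict_ram=lambda n: 0.02 * n)
print(chosen["name"])
```

Under these toy predictors the RAM constraint binds first for both types, and the smaller type wins on cost per supported user.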

2. DVTS as an Online Learning and Decision-Making Problem

Modern DVTS frameworks cast the type selection or resizing process as a sequential online learning problem. In ADARES (Cano et al., 2018), this is formalized as a contextual linear bandit problem:

  • Context Vector: For each VM at time t, the context x_{v,t} ∈ ℝ^d aggregates cluster-level (total load, memory pressure), node-level (CPU steal, free DRAM), and VM-level (recent CPU use, memory use, I/O wait) telemetry.
  • Action Set: The set A encodes discrete resizes, e.g., a_0 = no-change, a_1 = remove 1 vCPU, a_2 = add 1 GB RAM.
  • Payoff Model: Each action a has a parameter vector θ_a, with expected reward:

\hat{r}(a \mid x_{v,t}) = \theta_a^T x_{v,t}

  • Action Selection: A LinUCB-style index picks

a_{v,t} = \arg\max_{a \in A}\left[\theta_a^T x_{v,t} + \beta\sqrt{x_{v,t}^T A_a^{-1} x_{v,t}}\right]

where A_a is the per-arm design matrix and β tunes the exploration-exploitation trade-off.
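A minimal sketch of the LinUCB index computation above, assuming a small illustrative action set and context dimension (not ADARES’s exact configuration):

```python
import numpy as np

# Minimal LinUCB index for a DVTS-style action set (illustrative sketch).

def linucb_choose(actions, x, beta=1.0):
    """Pick argmax_a [theta_a^T x + beta * sqrt(x^T A_a^{-1} x)]."""
    best_a, best_score = None, -np.inf
    for name, (A, b) in actions.items():
        theta = np.linalg.solve(A, b)               # theta_a = A_a^{-1} b_a
        bonus = np.sqrt(x @ np.linalg.solve(A, x))  # exploration bonus
        score = theta @ x + beta * bonus
        if score > best_score:
            best_a, best_score = name, score
    return best_a

d = 3  # context dimension (e.g., CPU use, memory pressure, I/O wait)
# Cold start: A_a = I, b_a = 0 for every arm.
actions = {a: (np.eye(d), np.zeros(d))
           for a in ["no-change", "remove-vcpu", "add-ram"]}
x = np.array([0.2, 0.9, 0.1])  # toy normalized context
print(linucb_choose(actions, x))
```

With identical cold-start statistics all arms score equally, so the first arm is returned; the arms differentiate as each receives its own reward observations.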

In horizontal scaling (Grozev et al., 2016), selection is driven by a regression model predicting per-VM CPU/RAM load as a function of concurrent users, filtered by real-time performance anomaly detection and capacity normalization. The selection loop greedily chooses the type minimizing cost per supported user, using an online-trained ANN for resource prediction.

3. Core Algorithms and Learning Mechanisms

Two main algorithmic regimes dominate DVTS instantiations:

  • Contextual Bandit with Transfer Learning (ADARES):
    • Batch/Local and Global Priors: Each action’s θ_a is bootstrapped from global priors aggregated across other clusters, enabling an immediate “warm start” in new deployments and rapid convergence.
    • Online Update: Upon observing the realized reward r_{v,t} for the chosen action, only that arm’s statistics (A_{a_{v,t}}, b_{a_{v,t}}, θ_{a_{v,t}}) are updated via:

    A_{a_{v,t}} \leftarrow A_{a_{v,t}} + x_{v,t} x_{v,t}^T, \quad b_{a_{v,t}} \leftarrow b_{a_{v,t}} + r_{v,t} x_{v,t}

    \theta_{a_{v,t}} \leftarrow A_{a_{v,t}}^{-1} b_{a_{v,t}}

    • Transfer Updates: Periodically, cluster-specific models are blended with the global prior:

    \theta_a^0 \leftarrow \eta\,\theta_a^0 + (1-\eta)\,\theta_a^{\mathrm{loc}}, \quad 0 < \eta < 1
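The per-arm update and periodic transfer blend can be sketched directly from the equations above; the blend weight η and the toy context are illustrative:

```python
import numpy as np

# Sketch of the per-arm rank-one update and the periodic transfer blend.
# Variable names follow the equations above; eta is a tunable weight.

def update_arm(A, b, x, r):
    """Update the chosen arm's statistics after observing reward r."""
    A = A + np.outer(x, x)         # A_a <- A_a + x x^T
    b = b + r * x                  # b_a <- b_a + r x
    theta = np.linalg.solve(A, b)  # theta_a <- A_a^{-1} b_a
    return A, b, theta

def blend_with_global(theta_global, theta_local, eta=0.7):
    """theta_a^0 <- eta * theta_a^0 + (1 - eta) * theta_a^loc."""
    return eta * theta_global + (1.0 - eta) * theta_local

d = 3
A, b = np.eye(d), np.zeros(d)              # cold-start arm statistics
x, r = np.array([0.5, 0.1, 0.4]), 1.0      # toy context and reward
A, b, theta = update_arm(A, b, x, r)
theta0 = blend_with_global(np.zeros(d), theta)  # blend into a fresh global prior
```

Only the chosen arm pays the O(d^2) update cost, which keeps per-epoch work small even with many VMs.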

  • Online Regression and Anomaly Filtering (Application-Server DVTS):

    • Feature Extraction: Every 5 seconds, per-VM CPU/RAM load and user count are normalized and ingested.
    • Anomaly Detection: Hierarchical Temporal Memory (HTM) assigns an anomaly score to incoming metric tuples, modulating the ANN’s learning rate and momentum for rapid adaptation after workload shifts.
    • Dynamic Training: Bad samples (e.g., overload, idle, sudden jumps) are filtered; ANN retrains as valid data accrue, using an evolving schedule of learning rates and epochs.
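The anomaly-modulated hyperparameters and sample filter described above can be sketched as follows; the mapping from anomaly score to learning rate, and the filter thresholds, are illustrative choices rather than the exact schedule from the paper:

```python
# Illustrative sketch: anomaly score modulates training hyperparameters,
# and "bad" samples are filtered before they reach the regression model.

def adapt_hyperparams(anomaly_score, base_lr=0.01, base_momentum=0.9):
    """Raise the learning rate (and drop momentum) when the anomaly
    detector signals a workload shift, so the predictor re-learns fast."""
    assert 0.0 <= anomaly_score <= 1.0
    lr = base_lr * (1.0 + 9.0 * anomaly_score)        # up to 10x faster
    momentum = base_momentum * (1.0 - anomaly_score)  # forget old gradients
    return lr, momentum

def is_valid_sample(cpu, ram, users):
    """Filter overload and idle samples (illustrative thresholds)."""
    if users <= 0:
        return False  # idle: no signal about per-user load
    if cpu >= 0.95 or ram >= 0.95:
        return False  # overloaded: load no longer scales with users
    return True

print(adapt_hyperparams(0.0))  # steady state
print(adapt_hyperparams(1.0))  # flash crowd detected
```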

4. Safe Exploration and Risk Management

Safe exploration is essential to prevent performance regressions during aggressive resource reduction or adaptation:

  • Trust-Region Constraints: Only allow at most one “step” (e.g. −1 vCPU or −512MB) resource change per epoch, enabling prompt rollback and containing impact under anomalous behavior (Cano et al., 2018).
  • Risk-Aware Filtering: Before any downsize, compute a conservative lower confidence bound (LCB) on the predicted reward. If the LCB falls below a risk threshold τ, the action is aborted to prevent SLO violations.
  • Anomaly-Driven Learning Rate: In horizontal scaling, sudden spikes in the HTM anomaly score drive rapid retraining, avoiding prolonged misprovisioning after flash crowds or application changes (Grozev et al., 2016).
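The trust-region and LCB checks can be combined into a small pre-actuation gate. The bound form follows the LinUCB model above; the threshold τ and the numeric values are illustrative:

```python
import numpy as np

# Sketch of a risk-aware actuation gate: a downsize only proceeds when the
# lower confidence bound on its predicted reward clears a threshold tau,
# and any approved change is clamped to one step per epoch.

def safe_to_downsize(theta, A, x, beta=1.0, tau=0.0):
    """LCB = theta^T x - beta * sqrt(x^T A^{-1} x); act only if LCB >= tau."""
    lcb = theta @ x - beta * np.sqrt(x @ np.linalg.solve(A, x))
    return bool(lcb >= tau)

def clamp_step(requested_vcpu_delta):
    """Trust region: at most one vCPU step per epoch, so a bad decision
    can be rolled back at the next epoch."""
    return max(-1, min(1, requested_vcpu_delta))

d = 3
theta, A = np.full(d, 0.5), 4.0 * np.eye(d)  # toy learned model
x = np.array([0.6, 0.2, 0.2])                # toy context
if safe_to_downsize(theta, A, x):
    print(clamp_step(-3))  # a -3 vCPU request is clamped to -1
```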

5. Empirical Evaluation and Comparative Results

Empirical results highlight the efficacy and efficiency of DVTS-based systems under real-world cloud workloads:

Method              Δ vCPU   Δ RAM   SLO violations
Threshold-based      −12%     −20%        0.5%
Prediction-based     −18%     −30%        0.9%
ADARES (no TL)       −25%     −45%        0.8%
ADARES (with TL)     −35%     −60%        0.9%
  • In ADARES (Cano et al., 2018), vCPU allocation was reduced by up to 35% (mean 28%), and RAM by up to 60% (mean 52%) compared to baseline policies, with an SLO violation rate below 1%.
  • Transfer learning enabled rapid convergence: a stable resizing policy was reached within 3–4 days on a new cluster, versus more than 10 days without transfer.
  • In horizontal scaling (Grozev et al., 2016), DVTS reduced total cost by approximately 20% relative to the best static baseline, while meeting SLA constraints throughout workload bursts and behavioral changes.
  • The adaptive selection of VM types (not just the scaling point) resulted in better cost efficiency and resource utilization than conventional threshold-based autoscaling.

6. System Architecture and Real-Time Operation

  • Decision Frequencies: In ADARES, resizing decisions are made every 5 minutes based on the latest cluster, node, and VM telemetry. In the application-server scenario, system metrics are updated and anomaly scores are recomputed every 5 seconds, with scaling decisions triggered continuously based on farm-wide CPU thresholds.
  • Middleware Orchestration: Upon scaling, the autoscaler provisions new VMs, updates load balancers based on normalized compute units, mounts shared storage, and initializes anomaly models from the most-trained peer (Grozev et al., 2016).
  • Data Collection and Feature Management: Lightweight agents extract metrics (e.g. via sar, mpstat, vmstat) and maintain per-VM histories for both local and cross-cluster model learning.
  • Portability: Both approaches avoid reliance on proprietary hypervisor hooks, requiring only in-VM OS metrics and neutral workload benchmarks.
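For illustration, a lightweight in-VM agent along these lines might shell out to vmstat (one of the standard tools named above) and parse free memory and idle CPU; the column indices assume the standard procps vmstat layout, and the function degrades gracefully when the tool is absent:

```python
import shutil
import subprocess

# Illustrative metric-collection agent using only in-VM OS tools.

def sample_vmstat():
    """Return (free_kb, idle_cpu_pct) from a single vmstat report,
    or None if vmstat is not installed on this host."""
    if shutil.which("vmstat") is None:
        return None
    # "vmstat 1 2": the second report reflects the last one-second interval.
    out = subprocess.check_output(["vmstat", "1", "2"], text=True)
    fields = out.strip().splitlines()[-1].split()
    free_kb = int(fields[3])    # 'free' column (procps layout)
    idle_pct = int(fields[14])  # 'id' column (procps layout)
    return free_kb, idle_pct

sample = sample_vmstat()
if sample is not None:
    free_kb, idle_pct = sample
    print(f"free={free_kb}kB idle={idle_pct}%")
```

A real agent would append each sample to the per-VM history that feeds the local and cross-cluster models.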

7. Limitations and Outlook

Several technical limitations and open challenges are reported:

  • Multi-dimensional Coupling: Current methods (e.g., ADARES) treat CPU and memory resizing as independent bandits, ignoring latent coupling due to effects like swapping. Joint modeling of multi-resource spaces remains an open direction (Cano et al., 2018).
  • Workload Drift: Dynamic shifts in load patterns, application updates, or middleware changes require continuous model adaptation at both local and global levels.
  • Observation Noise: Rewards computed from instantaneous metrics may be noisy; robust regression and temporal smoothing can further suppress exploration spikes.
  • Extensions: While the DVTS framework can be extended to other resource dimensions (I/O, network bandwidth) in principle, practical deployment to these metrics is not yet demonstrated.
  • Generalizability: The prototype application-server DVTS (Grozev et al., 2016) is validated for PHP/Nginx workloads on AWS EC2, but domain-specific adaptation for other stacks or environments may require further tuning.

In summary, dynamic VM type selection operationalizes per-VM and per-cluster resource allocation as an adaptive, safe, and learnable control problem. Leveraging contextual bandits, anomaly-detected retraining, and transfer learning, DVTS systems yield significant resource and cost savings while maintaining stringent performance guarantees under variable workloads (Cano et al., 2018, Grozev et al., 2016).
