CloudSim Plus Simulation Framework
- CloudSim Plus is a discrete-event simulation framework with modular abstractions for modeling IaaS cloud environments.
- It introduces dynamic spot instance lifecycle modeling to simulate preemptible resources with interruption, hibernation, and termination states.
- The framework supports custom resource allocation algorithms and detailed event-driven tracing for robust cloud research insights.
CloudSim Plus is a discrete-event, object-oriented simulation framework for modeling Infrastructure-as-a-Service (IaaS) cloud environments, implemented in Java. It provides modular and extensible abstractions to simulate datacenters, hosts, brokers, virtual machines (VMs), and cloudlet-based workloads, facilitating research on resource allocation, scheduling, and market-driven cloud behaviors. Recent extensions by Goldgruber et al. (22 Nov 2025) add dynamic spot-instance lifecycle modeling to the framework, enabling simulation of preemptible resources as offered by commercial clouds.
1. Architecture and Core Components
CloudSim Plus is organized into two principal layers:
- Simulation Kernel: Manages event dispatching through the Future Event List (FEL), simulation time progression, and deferred event queues. The central class, `CloudSim`, orchestrates simulation initialization, FEL management, and termination.
- Infrastructure Model: Abstracts cloud resources via components such as `DatacenterSimple` (clusters hosts and allocation policies), `HostSimple` (models physical servers), `VmSimple` (VM abstraction), and `CloudletSimple` (task abstraction).
Key entities and mechanisms:
- `SimEntity` serves as the supertype for all active simulation objects, exposing methods for event transmission (`sendEvent()`), event processing, and time-based scheduling.
- `DatacenterBrokerAbstract` implements broker-side logic: VM and cloudlet orchestration, submissions, and result aggregation.
- Allocation and scheduling behaviors adhere to the Strategy pattern via `VmAllocationPolicyAbstract`, which researchers can subclass to evaluate custom heuristics.
- A robust tagging system (`CloudSimTags`) enables differentiated event processing for VM lifecycle (creation, destruction, cloudlet execution).
- Multiple design patterns (Observer, Factory, Builder) underpin monitoring (via event listeners such as `onClockTick` or `onVmAllocation`), configuration, and output table generation (`TableBuilderAbstract`).
This architecture supports extensive modification and experimentation, making CloudSim Plus a foundation for evaluating cloud resource-management strategies under various workload and policy assumptions.
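To make the role of the Future Event List concrete, the following is a minimal, self-contained sketch of a discrete-event loop in plain Java. It is not the CloudSim Plus kernel: `MiniKernel`, its methods, and the string tags are hypothetical names chosen for illustration, and real CloudSim Plus events carry source and destination entities as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy discrete-event kernel built around a future event list (FEL). Illustrative only. */
class MiniKernel {
    /** An event: firing time, insertion sequence (tie-breaker), and a tag. */
    static final class Event {
        final double time;
        final long seq;
        final String tag;

        Event(double time, long seq, String tag) {
            this.time = time;
            this.seq = seq;
            this.tag = tag;
        }
    }

    // The FEL orders events by time; equal times keep insertion order.
    private final PriorityQueue<Event> fel = new PriorityQueue<>((a, b) -> {
        int byTime = Double.compare(a.time, b.time);
        return byTime != 0 ? byTime : Long.compare(a.seq, b.seq);
    });
    private double clock = 0.0;
    private long nextSeq = 0;

    /** Schedules an event `delay` time units after the current clock. */
    void schedule(double delay, String tag) {
        if (delay < 0) {
            throw new IllegalArgumentException("delay must be >= 0");
        }
        fel.add(new Event(clock + delay, nextSeq++, tag));
    }

    /** Processes events in time order until the FEL is empty or endTime is exceeded. */
    List<String> run(double endTime) {
        List<String> processed = new ArrayList<>();
        while (!fel.isEmpty() && fel.peek().time <= endTime) {
            Event e = fel.poll();
            clock = e.time; // simulation time jumps straight to the next event
            processed.add(e.tag);
        }
        return processed;
    }

    double clock() {
        return clock;
    }
}
```

The key property is that simulated time advances by jumping from one event to the next rather than in fixed steps, which is what allows large scenarios to run quickly.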
2. Extensions for Dynamic Spot Instance Modeling
Goldgruber et al. introduce specialized constructs to simulate the volatile behavior of spot instances and on-demand VMs:
- Lifecycle States: Dynamic VMs (instances of `DynamicVm`, subclassing `VmSimple`) progress through the following states: WAITING (queued), RUNNING (active), INTERRUPTED (deallocated due to capacity pressure or market events), TERMINATED (immediate deletion), and optionally HIBERNATED (paused until capacity allows resumption or timeout expires).
- Spot Instance Parameters: The `SpotInstance` class is extended to encode `interruptionBehavior` (TERMINATE/HIBERNATE), `warningTime` (grace period before interruption), `hibernationTimeout`, and `persistentRequest` attributes. The latter facilitates repeated attempts to fulfill unallocated spot requests (subject to `maxWaitingTime`).
- Event Handling: Spot instance interruptions and terminations are enacted by dispatching events (`VM_DESTROY`, custom `SPOT_HIBERNATE` tags) through the kernel, leveraging the modular event-driven architecture. Hibernation and reallocation processes are managed by scheduled events and periodic checks (`onClockTickListener`).
- Broker Logic: The `DatacenterBrokerDynamic` maintains a resubmission list (`resubmittingList`) to automatically retry failed or interrupted spot allocations, invoking callbacks for hibernation completion or reallocation based on broker policy.
- Allocation Policy Extension: `DynamicAllocation` subclasses the policy abstraction to implement priority-based coercion, attempting to place new VMs by optionally deallocating spot VMs according to their interruption settings. This supports market-aware host deallocation and preemption.
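The lifecycle states listed above can be read as a small state machine. The sketch below is one possible encoding in plain Java; the class name `SpotLifecycle` and, in particular, the set of allowed transitions are assumptions inferred from the state descriptions, not the extension's actual implementation.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical state machine for a spot VM; the transition table is an assumption. */
class SpotLifecycle {
    enum State { WAITING, RUNNING, INTERRUPTED, HIBERNATED, TERMINATED }

    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        // WAITING: allocated, or dropped once the maximum waiting time has passed.
        ALLOWED.put(State.WAITING, EnumSet.of(State.RUNNING, State.TERMINATED));
        // RUNNING: preempted, or finished normally.
        ALLOWED.put(State.RUNNING, EnumSet.of(State.INTERRUPTED, State.TERMINATED));
        // INTERRUPTED: hibernate, re-queue (persistent request), or terminate.
        ALLOWED.put(State.INTERRUPTED,
                EnumSet.of(State.HIBERNATED, State.WAITING, State.TERMINATED));
        // HIBERNATED: resume when capacity returns, or terminate on timeout.
        ALLOWED.put(State.HIBERNATED, EnumSet.of(State.RUNNING, State.TERMINATED));
        // TERMINATED is final.
        ALLOWED.put(State.TERMINATED, EnumSet.noneOf(State.class));
    }

    private State state = State.WAITING;

    State state() {
        return state;
    }

    static boolean canMove(State from, State to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Applies a transition, rejecting any move not present in the table. */
    void moveTo(State next) {
        if (!canMove(state, next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```

Keeping the transitions in a table rather than scattered conditionals makes it straightforward to compare hibernation and termination behaviors by changing a single entry.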
| Component/Class | Extension Purpose | Key Behaviors/Fields |
|---|---|---|
| DatacenterBrokerDynamic | Spot VM lifecycle mgmt | Re-submission, interruption, hibernation logic |
| SpotInstance / OnDemandInstance | VM type distinction | Interruption settings, persistent request support |
| DynamicAllocation | Allocation with preemption | spotAllocation(), terminationBehavior(), host scans |
This extension enables fine-grained simulation of market-risk-related phenomena (e.g., interruption, hibernation, delayed fulfillment), paving the way for cost-reliability tradeoff studies.
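The preemption idea behind the allocation policy, where an on-demand request may displace spot VMs while a spot request may not, can be sketched on a single host as follows. This is a simplified illustration under stated assumptions: `PreemptingHost`, its nested `Vm` type, and the PE-only capacity model are hypothetical and do not reproduce `DynamicAllocation`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical single-host model of priority-based preemption (capacity counted in PEs). */
class PreemptingHost {
    static final class Vm {
        final String id;
        final int pes;
        final boolean spot;

        Vm(String id, int pes, boolean spot) {
            this.id = id;
            this.pes = pes;
            this.spot = spot;
        }
    }

    private final int totalPes;
    private final List<Vm> running = new ArrayList<>();

    PreemptingHost(int totalPes) {
        this.totalPes = totalPes;
    }

    int freePes() {
        int used = 0;
        for (Vm vm : running) {
            used += vm.pes;
        }
        return totalPes - used;
    }

    /**
     * Tries to place a VM. On-demand requests may evict spot VMs; spot requests never evict.
     * Returns the ids of evicted spot VMs on success, or an empty Optional on failure.
     */
    Optional<List<String>> place(Vm vm) {
        List<Vm> victims = new ArrayList<>();
        int free = freePes();
        if (free < vm.pes && !vm.spot) {
            for (Vm candidate : running) {
                if (free >= vm.pes) {
                    break;
                }
                if (candidate.spot) {
                    victims.add(candidate);
                    free += candidate.pes;
                }
            }
        }
        if (free < vm.pes) {
            return Optional.empty(); // evictions are only committed on success
        }
        running.removeAll(victims);
        running.add(vm);
        List<String> evicted = new ArrayList<>();
        for (Vm v : victims) {
            evicted.add(v.id);
        }
        return Optional.of(evicted);
    }
}
```

In a fuller model each evicted VM would then follow its own interruption setting (hibernate or terminate) and, for persistent requests, return to the broker's resubmission list.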
3. Mathematical Models and Allocation Algorithms
Spot-price volatility is modeled implicitly via demand–supply-based interruption rates, potentially parameterized with real-world statistics (e.g., the interruption frequency that AWS Spot Advisor reports for a given instance type).
The HLEM-VMP (Host Load Entropy Minimization–VM Placement) allocation algorithm, adapted for spot markets, operates as follows:
- Host Filtering:
Hosts are considered if .
- Normalization and Scoring:
Scores incorporate multi-resource entropy, penalizing imbalance.
- Spot-Load Adjustment:
Spot-load penalty is applied to discourage excessive spot concentrations per host.
This suggests the framework can evaluate the resilience of allocation heuristics against market-driven volatility.
The computational complexity is , supporting large-scale simulations.
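The three steps above (filter, score, spot-load adjustment) can be illustrated with a small sketch. The formulas used here are assumptions made for illustration and are not the published HLEM-VMP equations: the filter requires enough free CPU and RAM, imbalance is measured as the entropy deficit of the normalized post-placement utilizations, and the spot penalty is a weight `lambda` times a host's spot share. All names (`EntropyPlacementSketch`, `Host`, `choose`) are hypothetical.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative entropy-based host selection; formulas are assumptions, not HLEM-VMP itself. */
class EntropyPlacementSketch {
    static final class Host {
        final String id;
        final double cpuCap, ramCap, cpuUsed, ramUsed;
        final double spotShare; // fraction of the host's load that is spot, in [0, 1]

        Host(String id, double cpuCap, double ramCap,
             double cpuUsed, double ramUsed, double spotShare) {
            this.id = id;
            this.cpuCap = cpuCap;
            this.ramCap = ramCap;
            this.cpuUsed = cpuUsed;
            this.ramUsed = ramUsed;
            this.spotShare = spotShare;
        }
    }

    /** Entropy deficit ln(2) - H of the two utilization shares; 0 when perfectly balanced. */
    static double imbalance(double cpuUtil, double ramUtil) {
        double sum = cpuUtil + ramUtil;
        if (sum <= 0) {
            return 0.0;
        }
        double entropy = 0.0;
        for (double u : new double[] {cpuUtil, ramUtil}) {
            double p = u / sum;
            if (p > 0) {
                entropy -= p * Math.log(p);
            }
        }
        return Math.log(2) - entropy;
    }

    /** Returns the feasible host with the lowest score (imbalance + lambda * spotShare). */
    static Optional<Host> choose(List<Host> hosts, double cpuReq, double ramReq, double lambda) {
        Host best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (Host h : hosts) {
            // Host filtering (assumed condition): enough free CPU and RAM.
            if (h.cpuCap - h.cpuUsed < cpuReq || h.ramCap - h.ramUsed < ramReq) {
                continue;
            }
            // Normalization: utilization after placing the VM, as a fraction of capacity.
            double cpuUtil = (h.cpuUsed + cpuReq) / h.cpuCap;
            double ramUtil = (h.ramUsed + ramReq) / h.ramCap;
            // Scoring with spot-load adjustment.
            double score = imbalance(cpuUtil, ramUtil) + lambda * h.spotShare;
            if (score < bestScore) {
                bestScore = score;
                best = h;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

This sketch makes a single pass over the host list per placement; setting `lambda` to zero recovers a purely balance-driven choice, which is a convenient way to isolate the effect of the spot-load term.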
4. Validation and Evaluation Methodology
Framework extensions were validated using synthetic experiments and large-scale cluster traces:
- Synthetic Scenarios:
- Diverse host types (Small, Medium, Large, X-Large) with parameterized CPU, RAM, and bandwidth.
- VM profiles spanning 1–10 vCPUs, 1–8 GB RAM, 100–1,000 Mbps BW, and 10,000–80,000 MB storage.
- 2,000 VMs (400 spot, 600 on-demand, 1,000 delayed submissions), randomized run lengths but fixed across repetitions.
- Google Cluster Trace Evaluation:
- Mapping 48 million tasks onto synthetic VMs (12,600 machines), plus injection of 200,000 spot instances (fixed durations).
- Utilization of an extended `TraceReader` for mapping, event handling (EVICT/FAIL), and specification completion.
- Simulations operated for up to two days, producing detailed CSV/JSON execution histories, interruption logs, and lifecycle tables.
| Scenario | Scale | Outputs |
|---|---|---|
| Synthetic (parameterized) | 2,000 VMs, 100 hosts | Interruption counts, durations, VM stats, allocation logs |
| Google Trace (~1–2 days) | 28.8M VMs, 12.6k hosts | Detailed, large-scale trace-driven execution tables |
A plausible implication is that this enables comprehensive benchmarking of allocation strategies under realistic and synthetic cloud market dynamics.
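The synthetic scenario relies on run lengths that are randomized yet identical across repetitions, which in practice means drawing them from a seeded generator. A minimal sketch of that idea follows; the class name, the uniform distribution, and the bounds are assumptions for illustration and are not taken from the evaluation setup.

```java
import java.util.Random;

/** Hypothetical helper that draws reproducible run lengths from a fixed seed. */
class SeededWorkload {
    /** Draws n run lengths uniformly from [minSec, maxSec] (inclusive) using the given seed. */
    static long[] runLengths(int n, long minSec, long maxSec, long seed) {
        if (n < 0 || minSec > maxSec) {
            throw new IllegalArgumentException("invalid arguments");
        }
        Random rng = new Random(seed); // same seed -> same sequence on every repetition
        long[] out = new long[n];
        for (int i = 0; i < n; i++) {
            out[i] = minSec + (long) (rng.nextDouble() * (maxSec - minSec + 1));
        }
        return out;
    }
}
```

Calling `SeededWorkload.runLengths(2000, 10, 60, 42L)` twice yields the same array, so differences between runs can be attributed to the allocation policy rather than to the workload.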
5. Comparative Algorithm Performance and Metrics
Metrics collected include:
- Number of spot interruptions
- Average and maximum interruption durations per VM
- Optional cost savings: (sum of runtime × spot price) vs. on-demand baselines
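The cost-savings metric compares the sum of runtime × spot price against the same runtimes billed at on-demand prices. A small sketch of that calculation is shown below; `SpotCost` and its methods are hypothetical helpers, and the prices in the usage note are invented for illustration.

```java
/** Hypothetical helper for the cost-savings metric; not part of the extension. */
class SpotCost {
    /** Total cost: sum over VMs of runtime (hours) x hourly price. */
    static double cost(double[] runtimeHours, double[] hourlyPrice) {
        if (runtimeHours.length != hourlyPrice.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double total = 0.0;
        for (int i = 0; i < runtimeHours.length; i++) {
            total += runtimeHours[i] * hourlyPrice[i];
        }
        return total;
    }

    /** Relative saving of spot pricing against the on-demand baseline, in [0, 1] if spot is cheaper. */
    static double savings(double[] runtimeHours, double[] spotPrice, double[] onDemandPrice) {
        double baseline = cost(runtimeHours, onDemandPrice);
        if (baseline == 0.0) {
            return 0.0; // nothing ran, so nothing was saved
        }
        return 1.0 - cost(runtimeHours, spotPrice) / baseline;
    }
}
```

For example, two VMs running 2 h and 4 h at made-up spot prices of 0.03 and 0.05 per hour cost 0.26, against 0.60 at an on-demand price of 0.10 per hour, a saving of roughly 57%.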
Comparative outcomes (synthetic run):
- First-Fit: 286 interruptions; avg. 22.81 s; max 64.87 s
- HLEM-VMP: 230 interruptions; avg. 21.12 s; max 49.49 s
- HLEM-VMPAdjusted: 205 interruptions; avg. 25.20 s; max 45.65 s
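These results suggest that HLEM-VMP and its spot-aware adjusted variant reduce both the number of interruptions and the worst-case interruption duration relative to First-Fit. The adjusted variant, however, shows the highest average interruption duration of the three (25.20 s), so the improvement does not extend to every metric.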
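The relative changes against the First-Fit baseline follow directly from the figures listed above (values rounded to one decimal place):

$$
\begin{aligned}
\text{HLEM-VMP, interruptions:} &\quad \frac{286 - 230}{286} \approx 19.6\% \text{ fewer} \\
\text{HLEM-VMP Adjusted, interruptions:} &\quad \frac{286 - 205}{286} \approx 28.3\% \text{ fewer} \\
\text{HLEM-VMP, maximum duration:} &\quad \frac{64.87 - 49.49}{64.87} \approx 23.7\% \text{ shorter} \\
\text{HLEM-VMP Adjusted, maximum duration:} &\quad \frac{64.87 - 45.65}{64.87} \approx 29.6\% \text{ shorter} \\
\text{HLEM-VMP Adjusted, average duration:} &\quad \frac{25.20 - 22.81}{22.81} \approx 10.5\% \text{ longer}
\end{aligned}
$$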
6. Practical Usage and Integration
Integration into research or production-grade simulation projects proceeds as follows:
- Repository: Clone https://github.com/CGoldi/cloudsimplus6-spot-instance
- Integration: Add JAR/source to classpath; use classes such as `DatacenterBrokerDynamic` and `DynamicAllocationHLEM` (or HLEMAdjusted).
- Instantiation: Create simulation environments via standard CloudSim Plus classes, leveraging new spot/on-demand VM types.
A minimal instantiation sequence (Java):
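```java
import java.util.List;
import org.cloudbus.cloudsim.cloudlets.Cloudlet;
import org.cloudbus.cloudsim.cloudlets.CloudletSimple;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.datacenters.Datacenter;
import org.cloudbus.cloudsim.datacenters.DatacenterSimple;
import org.cloudbus.cloudsim.resources.PeSimple;
import org.cloudbus.cloudsim.utilizationmodels.UtilizationModelFull;
import org.cloudbus.cloudsim.vms.Vm;
// The extension classes (HostDynamic, SpotInstance, OnDemandInstance, ...) must also be
// imported from the spot-instance repository; their packages are not listed here.
// The statements below belong inside a method such as main().

// Initialize simulation
CloudSim simulation = new CloudSim(0.5);
simulation.terminateAt(70);

// Provision hosts and datacenter
HostDynamic host = new HostDynamic(2048, 10000, 1_000_000, List.of(new PeSimple(1000)));
VmAllocationPolicyDynamic allocPolicy = new DynamicAllocationHLEM();
Datacenter datacenter = new DatacenterSimple(simulation, List.of(host), allocPolicy);

// Create broker and VMs (spot/on-demand)
DatacenterBrokerDynamic broker = new DatacenterBrokerDynamic(simulation);
SpotInstance spotVm = new SpotInstance(1000, 2, true);
spotVm.setRam(512).setBw(1000).setSize(10_000);
// Set separately so the call does not depend on the return type of the setter chain
spotVm.setInterruptionBehavior(SpotInstance.InterruptionBehavior.HIBERNATE);
OnDemandInstance ondVm = new OnDemandInstance(1000, 2, true);
ondVm.setRam(512).setBw(1000).setSize(10_000)
     .setSubmissionDelay(10);

// Create cloudlets and submit
Cloudlet cloudlet1 = new CloudletSimple(1, 20000).setVm(spotVm)
        .setUtilizationModel(new UtilizationModelFull());
// cloudlet2 was used but never defined in the original listing; assumed to mirror cloudlet1
Cloudlet cloudlet2 = new CloudletSimple(1, 20000).setVm(ondVm)
        .setUtilizationModel(new UtilizationModelFull());
broker.submitVm(spotVm).submitCloudlet(cloudlet1);
broker.submitVm(ondVm).submitCloudlet(cloudlet2);

// Event listener and table export
simulation.addOnClockTickListener(evt -> {
    for (Vm vm : broker.getVmExecList()) {
        vm.updateProcessing(simulation.clock(),
                vm.getHost().getVmScheduler().getAllocatedMips(vm));
    }
});
simulation.start();
new DynamicVmTableBuilder(broker.getVmFinishedList()).build();
```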
Practical observations:
- Full backward compatibility is maintained; no kernel modifications are required.
- Per-VM interruption policies support flexible "what-if" analysis of hibernation vs. termination.
- Researchers may encounter memory or CPU bottlenecks at large scales; data may be partitioned or processed in parallel.
- The event/listener design facilitates plug-in of alternative heuristics (e.g., ML-driven scheduling).
- While explicit spot price time-series are not modeled, the extension can ingest real price/interruption traces for stochastic event generation.
7. Research Implications and Prospects
By integrating interruption-aware spot modeling with established trace datasets and extensible brokers, CloudSim Plus improves simulation fidelity for dynamic cloud markets. It supports systematic evaluation of scheduling strategies, quantifies cost–reliability trade-offs, and enables exploration of emerging IaaS pricing and allocation paradigms under volatility (Goldgruber et al., 22 Nov 2025). The framework thus provides a foundation for academic and industry research into robust cloud resource management and market-driven workload analysis.