
ElasticVR: Adaptive VR Resource Management

Updated 20 December 2025
  • ElasticVR is an end-to-end framework that integrates scalable 360° video tiling, elastic offloading, and multi-agent reinforcement learning to adapt VR streaming under hard latency constraints.
  • It employs both centralized (CPPG) and decentralized (IPPG/CTDE) DRL approaches to jointly optimize quality, response time, and energy usage across heterogeneous networks.
  • Performance benchmarks indicate up to a 43.21% PSNR gain and a 56.83% reduction in energy consumption relative to baselines, showcasing its potential for immersive, resource-constrained VR applications.

ElasticVR is an end-to-end framework for adaptive, elastic computation and resource management in multi-user, multi-connectivity wireless virtual reality (VR) systems. The framework integrates scalable 360° video tiling, elastic computational task offloading, and multi-agent deep reinforcement learning to optimize user-perceived quality of experience (QoE) and system efficiency under hard latency constraints. It enables the joint adaptation of both task fidelity and computation offloading strategy across heterogeneous edge-client architectures and volatile wireless networks (Badnava et al., 13 Dec 2025).

1. System Architecture and Scalable Tiling

ElasticVR is designed for scenarios where multiple VR users, each equipped with a head-mounted display (HMD) possessing a lightweight CPU and battery, interact with edge servers through several heterogeneous wireless channels (including 4G, 5G, and WiGig). The architecture supports two modes: local computation (on the HMD) or offloading to a nearby multi-access edge computing (MEC) server with a constrained CPU budget ($Z_{\mathrm{mec}}$). The throughput of each channel $c$ for a user $k$ is time-varying ($R_k^t(c)$) and is averaged to $R_k(c)$ over a task's transmission interval.

Elastic 360° video streaming is achieved by dividing each group-of-pictures (GoP) into an $H \times V$ grid of spatial tiles. For each tile $(h,v)$, a base-layer size $b_{h,v}(0)$ and $L$ enhancement-layer sizes $b_{h,v}(l)$ are precomputed. The current field of view (FOV), determined by head pose, yields a binary mask $m_{h,v}$ (1 for viewport tiles). The elasticity index $e_k \in \{0,1,\ldots,L\}$ controls task fidelity, yielding an input size:

$$S_k(e_k) = \sum_{h,v} b_{h,v}(0) + \sum_{l=1}^{e_k}\sum_{h,v} m_{h,v}\, b_{h,v}(l),$$

with computational intensity scaling linearly as $I_k(e_k) = \beta S_k(e_k)$ (cycles/bit).
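The input-size formula is straightforward to evaluate given the precomputed tile sizes and the FOV mask. A minimal sketch follows, assuming NumPy arrays for the tile-size tables; the function name, tiling dimensions, and all numeric values are illustrative, not taken from the paper:

```python
import numpy as np

def task_input_size(base, enh, mask, e_k):
    """S_k(e_k): base layer of every tile, plus the first e_k
    enhancement layers of the viewport tiles only.
    base: (H, V) sizes b_{h,v}(0); enh: (L, H, V) sizes b_{h,v}(l);
    mask: (H, V) binary FOV mask m_{h,v}."""
    size = base.sum()                        # sum_{h,v} b_{h,v}(0)
    for l in range(e_k):                     # layers l = 1 .. e_k
        size += (mask * enh[l]).sum()        # sum_{h,v} m_{h,v} b_{h,v}(l)
    return size

# Hypothetical 4x4 tiling with L = 2 enhancement layers.
H, V, L = 4, 4, 2
base = np.full((H, V), 1e5)                  # 100 kbit base layer per tile
enh  = np.full((L, H, V), 5e4)               # 50 kbit per enhancement layer
mask = np.zeros((H, V))
mask[1:3, 1:3] = 1                           # 2x2 viewport
S_k = task_input_size(base, enh, mask, e_k=2)
beta = 10.0                                  # assumed scaling constant
I_k = beta * S_k                             # intensity I_k(e_k), cycles/bit
```

Raising $e_k$ adds enhancement layers only inside the viewport, so quality grows where the user is looking while the input size (and hence compute load) grows sublinearly in the total tile count.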

Task execution time and energy depend on whether the task is run locally or offloaded. Local computation time is $T^c(S_k, I_k, f_k^{vr}) = S_k(e_k)\, I_k(e_k) / f_k^{vr}$; offloaded computation time sums transmission, MEC processing, and result retrieval. The local CPU energy is $E_k^c = \kappa S_k I_k (f_k^{vr})^2$, with offload energy reflecting radio transmission and reception.
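A minimal sketch of both cost models, assuming the additive decomposition stated above; the output size `S_out` and the radio powers `P_tx`/`P_rx` are assumed placeholders for the paper's transmission-energy terms:

```python
def local_time_energy(S, I, f_vr, kappa):
    """Local HMD execution: S*I total CPU cycles at clock f_vr (Hz)."""
    T_loc = S * I / f_vr                     # T^c = S_k I_k / f_k^vr
    E_loc = kappa * S * I * f_vr ** 2        # E^c = kappa S_k I_k (f_k^vr)^2
    return T_loc, E_loc

def offload_time_energy(S, S_out, R_up, R_down, I, f_mec, P_tx, P_rx):
    """Offloaded execution: uplink transfer + MEC processing + result
    return. R_up/R_down are the throughputs R_k(c) averaged over the
    transfer interval; S_out, P_tx, P_rx are assumed, not from the paper."""
    T_off = S / R_up + S * I / f_mec + S_out / R_down
    E_off = P_tx * (S / R_up) + P_rx * (S_out / R_down)
    return T_off, E_off
```

The trade-off the learner must navigate is visible directly in these expressions: offloading removes the $\kappa S I f^2$ compute energy from the battery but adds transmission time and radio energy that scale with the channel's current throughput.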

2. Joint Optimization of Quality, Latency, and Energy

ElasticVR formalizes the global system objective as a constrained, multi-metric optimization problem, balancing:

  • QoE: average viewport PSNR $Q(\vec e) = (1/K) \sum_k q(e_k)$,
  • Response time: $T(\vec e, \vec u) = (1/K) \sum_k T_k^r(e_k, u_k)$,
  • Energy consumption: $E(\vec e, \vec u) = (1/K) \sum_k E_k^{\mathrm{tot}}(e_k, u_k)$,

where $e_k$ indexes task elasticity and $u_k$ the offloading link. The scalarized objective is:

$$\max_{\vec e, \vec u}\; QTE(\vec e, \vec u) = w_0\, Q(\vec e) - w_1\, T(\vec e, \vec u) - w_2\, E(\vec e, \vec u)$$

subject to per-task deadlines $T_k^r(e_k,u_k) \leq T_k^d$ and the MEC CPU constraint $\sum_{k:u_k>0} z(f_k^{mec}) \leq Z_{\mathrm{mec}}$.
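For small $K$ this constrained problem can be solved exactly by enumeration, which is how a Pareto-optimal reference (like the brute-force baseline used in Section 5) can be obtained. A sketch under assumed interfaces: `q`, `T_r`, `E_tot`, and `z` are hypothetical callables implementing the per-user models above, with `u = 0` denoting local execution:

```python
from itertools import product

def brute_force_qte(q, T_r, E_tot, z, T_d, K, L, C, Z_mec, w0, w1, w2):
    """Enumerate all joint (e_k, u_k) assignments; exponential in K,
    so usable only as a small-K reference, not an online solver."""
    per_user = list(product(range(L + 1), range(C + 1)))   # (e, u) pairs
    best, best_val = None, float("-inf")
    for joint in product(per_user, repeat=K):
        # hard per-task deadlines
        if any(T_r(k, e, u) > T_d[k] for k, (e, u) in enumerate(joint)):
            continue
        # shared MEC CPU budget over offloaded users
        if sum(z(k, e, u) for k, (e, u) in enumerate(joint) if u > 0) > Z_mec:
            continue
        Q = sum(q(e) for e, _ in joint) / K
        T = sum(T_r(k, e, u) for k, (e, u) in enumerate(joint)) / K
        E = sum(E_tot(k, e, u) for k, (e, u) in enumerate(joint)) / K
        val = w0 * Q - w1 * T - w2 * E
        if val > best_val:
            best, best_val = joint, val
    return best, best_val
```

The $((L+1)(C+1))^K$ search space is exactly why the paper turns to learned policies for online operation.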

3. Multi-Agent Reinforcement Learning Approaches

ElasticVR frames this problem as a multi-agent Markov decision process. At each GoP, every user $k$ selects an action $a_t^k = (e_k^t, u_k^t)$ based on local observations. The framework introduces two solution paradigms:

  • Centralized Phasic Policy Gradient (CPPG): Implements centralized actor-critic DRL with the full global state $s_t = [s_t^1,\ldots,s_t^K]$ and joint action $a_t = [a_t^1,\ldots,a_t^K]$. The reward incorporates QoE, response time, energy, and deadline-violation penalties. The actor and critic losses follow proximal policy optimization (PPO) with phasic updates. CPPG exploits all cross-user couplings (e.g., the shared MEC CPU), achieving near-optimal coordination, but suffers from $O(K)$ state/action scaling and the resulting training and inference overhead, limiting scalability for large user groups.
  • Independent Phasic Policy Gradient (IPPG): Adopts a centralized-training, decentralized-execution (CTDE) paradigm. Each agent (user) observes only its local state and acts independently at execution, but shares parameters and pooled experience during training (see the loss sketch after this list). This ensures $O(1)$ per-user execution overhead and high sample efficiency via parameter sharing. The local PPO objective maintains global reward coupling, but state observability is partial at runtime. Empirical results indicate IPPG retains near-optimal performance as $K$ increases, whereas CPPG throughput degrades for $K \gtrsim 5$.
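Both variants rest on the standard clipped PPO surrogate; a minimal PyTorch sketch of that policy-phase loss is shown below. This is the generic PPO objective, not the paper's exact implementation; the phasic (auxiliary value) phase of PPG is omitted for brevity:

```python
import torch

def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """Clipped PPO surrogate (policy phase). Under IPPG-style parameter
    sharing, transitions from all K agents are pooled into one batch and
    a single shared policy is updated with this loss."""
    ratio = torch.exp(new_logp - old_logp)           # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()     # negate to maximize
```

Parameter sharing is what buys IPPG its sample efficiency: every agent's experience improves the one shared policy, so adding users enlarges the training batch rather than the model.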

4. Training and Execution Paradigms

Two execution models correspond to the solution variants:

  • CTCE (CPPG): Both training and inference are conducted at the MEC. Each HMD uploads its local state $s_t^k$ at every GoP, and the MEC computes the global joint action. This configuration guarantees maximal coordination, but introduces significant uplink overhead, single-point-of-failure risk, and limited scalability.
  • CTDE (IPPG): Centralized training occurs at the MEC with all users' experience pooled for periodic parameter updates. Execution is fully decentralized: each HMD runs its local policy using the current parameters and communicates with the MEC only occasionally for retraining (see the runtime sketch after this list). This decentralized design dramatically lowers communication cost and removes inference bottlenecks, though it forgoes explicit cross-user coordination at runtime.
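The CTDE runtime on a single HMD reduces to a simple loop: act locally every GoP, and sync with the MEC only periodically. A sketch under assumed interfaces; `policy.act`, `policy.load`, `mec.upload_experience`, and `mec.latest_params` are hypothetical names, not the paper's API:

```python
def ctde_execution_loop(policy, local_obs_stream, mec, sync_every=500):
    """Decentralized execution on one HMD: local decisions each GoP,
    occasional experience upload and parameter refresh for retraining."""
    buffer = []
    for t, obs in enumerate(local_obs_stream):
        e_k, u_k = policy.act(obs)            # local (elasticity, link) choice
        buffer.append((obs, (e_k, u_k)))
        if t > 0 and t % sync_every == 0:
            mec.upload_experience(buffer)     # pooled for centralized training
            policy.load(mec.latest_params())  # periodic parameter refresh
            buffer.clear()
```

Compared with CTCE, the per-GoP uplink of $s_t^k$ disappears; only the infrequent sync traffic remains, which is the source of the communication savings described above.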

5. Performance Benchmarking and Comparative Evaluation

ElasticVR’s efficacy is validated on public 360° video datasets and real 4G/5G/WiGig traces. Comparative evaluation includes:

  • Pareto-optimal brute-force reference
  • Neural epsilon-greedy joint learning
  • Elasticity-agnostic PPO (EA-Offloader) baseline ($e_k = 4$, optimizing only the offloading decision)

Key results (for $K=3$, weights $w_0=0.35$, $w_1=0.85$, $w_2=0.15$):

| Method | PSNR (dB) | RT (s) | Energy (mJ) | Deadline Violations |
|---|---|---|---|---|
| Optimal | 49.24 ± 1.03 | 0.57 ± 0.07 | 1.80 ± 0.98 | 0% |
| CPPG | 48.35 ± 0.80 | 0.49 ± 0.13 | 1.01 ± 0.74 | 0% |
| IPPG | 48.14 ± 0.57 | 0.53 ± 0.03 | 0.61 ± 0.04 | 0% |
| EA-Offloader | 38.63 ± 16.44 | 0.85 ± 0.35 | 2.34 ± 0.92 | 32.4% |
| ε-Greedy | 33.81 ± 15.68 | 0.85 ± 0.73 | 2.06 ± 2.39 | 40.7% |

ElasticVR (CPPG/IPPG) delivers a $+43.21\%$ PSNR gain, $-42.35\%$ response time, and $-56.83\%$ energy consumption relative to the elasticity-agnostic baseline. PSNR–RT–energy trade-off trajectories (Pareto front) confirm that adaptive elasticity and learned offloading achieve performance within 5% of the global optimum. Fixed $e_k$ values result in either poor QoE–energy trade-offs or massive deadline violations; only adaptive elasticity consistently achieves sub-12% deadline violations while balancing all metrics.

6. Adaptation, Scalability, and Limitations

ElasticVR’s adaptive tiling and multi-agent DRL enable robust performance across a range of network and computational conditions, including dynamic channel variation and variable user counts. The decentralized (IPPG/CTDE) variant ensures practical scalability and resilience against single points of failure. A plausible implication is that, for small user groups, centralized CPPG may yield marginally superior coordination, but CTDE/parameter-sharing techniques become essential as network scale grows and environments become more heterogeneous.

A known limitation is the partial observability under decentralized execution, which may yield modest suboptimality in low-$K$ deployments. The framework assumes the elasticity index can be adjusted per GoP and that per-tile enhancement layers are precomputed and available.

7. Significance and Application Contexts

ElasticVR addresses the critical challenge of real-time, immersive 360° VR streaming in multi-user, resource-constrained, and bandwidth-variable environments. The integration of elastic 360° tiling with DRL-based offloading offers a generalizable approach to maximizing application-level QoE while conforming to stringent latency and energy budgets. This approach is directly applicable to multi-user VR arenas, collaborative VR platforms, and future wireless edge architectures demanding scalable, adaptive resource provisioning and high-fidelity, user-centric experiences. Methods and results have influenced contemporary directions in elastic VR task computing and adaptive streaming research, unifying resource allocation with perceptual optimization (Badnava et al., 13 Dec 2025).
