Combine Vortex’s IO redistribution with intra-GPU slicing for higher system efficiency
Develop and evaluate techniques that integrate Vortex’s IO redistribution across GPUs with intra-GPU slicing (e.g., partitioning a single GPU into slices) to collocate complementary workloads and further improve overall system efficiency in multi-tenant environments.
References
[Vortex]'s IO redistribution idea at the GPU level, can be combined with GPU slicing at sub-GPU granularity for even higher overall system efficiency, which we leave as future work.
— Vortex: Overcoming Memory Capacity Limitations in GPU-Accelerated Large-Scale Data Analytics
(2502.09541 - Yuan et al., 13 Feb 2025) in Section 10: Related Work, GPU slicing for workload collocation