
Optimal ordering of AoS/SoA conversion and CPU–GPU data movement operations

Determine the optimal ordering of narrowing, AoS-to-SoA conversion, host-to-device memory mapping, GPU kernel execution, device-to-host memory mapping, SoA-to-AoS conversion, and widening when offloading AoS-origin data to GPUs, so as to minimize total runtime, particularly on tightly integrated CPU–GPU systems and when in-situ conversions on the GPU are possible.


Background

The authors formalize a specific sequence for GPU offloading: narrowing to the accessed fields, converting AoS to SoA, mapping host data to the GPU, executing the GPU kernel, mapping results back, converting SoA to AoS, and widening to the original structure. They observe that this order is natural given memory-movement constraints, but discuss alternative orderings for tightly integrated CPU–GPU systems, especially if in-situ conversions on the GPU become possible.
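
A minimal sketch of this baseline ordering is shown below, assuming a hypothetical Particle record and an OpenMP target offload. The paper's actual transformations are annotation-guided and use C++ data views; here the steps are hand-coded with plain loops purely to illustrate the sequence, and the struct, field names, and kernel body are illustrative assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical AoS record; only 'pos' is accessed by the kernel below.
struct Particle {
    double pos[3];
    double vel[3];
    double mass;
};

// Illustrative pipeline in the order discussed above:
// narrow -> AoS-to-SoA -> map to device -> kernel -> map back -> SoA-to-AoS -> widen.
void offload_step(std::vector<Particle>& particles, double dt) {
    const std::size_t n = particles.size();

    // (1) Narrowing + (2) AoS-to-SoA: copy only the accessed field into
    //     contiguous per-component host arrays.
    std::vector<double> x(n), y(n), z(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = particles[i].pos[0];
        y[i] = particles[i].pos[1];
        z[i] = particles[i].pos[2];
    }

    double* px = x.data();
    double* py = y.data();
    double* pz = z.data();

    // (3) Map host data to the GPU, (4) run the kernel, (5) map results back.
    #pragma omp target teams distribute parallel for \
        map(tofrom: px[0:n], py[0:n], pz[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        // Placeholder kernel: shift every position component by dt.
        px[i] += dt;
        py[i] += dt;
        pz[i] += dt;
    }

    // (6) SoA-to-AoS + (7) widening: write the results back into the
    //     original wide records.
    for (std::size_t i = 0; i < n; ++i) {
        particles[i].pos[0] = x[i];
        particles[i].pos[1] = y[i];
        particles[i].pos[2] = z[i];
    }
}
```

The open question is whether interleaving or relocating these steps, for example performing the AoS-to-SoA conversion in situ on the device after a wide mapping, can reduce total runtime on some hardware.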

They explicitly state that it is not clear whether the presented order is optimal in all cases, motivating the need to establish optimal orderings under different hardware integration scenarios.

References

Given the observation that memory movements become a limiting factor in GPU utilisation, the order of operations in (equation) is natural. However, it is not clear that this order is optimal in all cases.

Annotation-guided AoS-to-SoA conversions and GPU offloading with data views in C++ (2502.16517 - Radtke et al., 23 Feb 2025) in Section 4 Formal code transformations and algorithm description, Subsection GPU offloading