Explain ThunderX2 Imbalance in Ondes3D
Determine why the Ondes3D seismic wave simulator shows load imbalance and a longer overall execution time on the ARM ThunderX2 99xx architecture, even though the CPML4 microkernel runs faster on ThunderX2 than on Intel Skylake. In particular, identify which kernels (intermediates, stress, velocity) or domain regions (Physical Domain, Absorbing Boundary Conditions, Free Surface) are responsible for the imbalance and the performance degradation.
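One way to start attributing the imbalance is to aggregate per-rank timings by kernel and domain region and compare the slowest rank against the average. The sketch below is only an illustration under stated assumptions: the function name `imbalance_by_group`, the input tuple layout, and the choice of metric (maximum over mean, minus one) are hypothetical and are not taken from the paper or from Ondes3D itself. Ranks that spend no time in a group are counted as zero, since a region such as the Free Surface may exist on only some ranks.

```python
from collections import defaultdict
from statistics import mean


def imbalance_by_group(samples):
    """Summarise load imbalance per (kernel, region) pair.

    `samples` is an iterable of (rank, kernel, region, seconds) tuples.
    This layout is an assumption made for illustration; adapt it to
    whatever tracing or timing output is actually available.
    """
    totals = defaultdict(lambda: defaultdict(float))
    ranks = set()
    for rank, kernel, region, seconds in samples:
        totals[(kernel, region)][rank] += seconds
        ranks.add(rank)

    report = {}
    for group, by_rank in totals.items():
        # Ranks with no work in this group contribute zero seconds.
        times = [by_rank.get(rank, 0.0) for rank in sorted(ranks)]
        avg = mean(times)
        worst = max(times)
        # Assumed metric: 0.0 means perfectly balanced; larger values mean
        # the slowest rank is further above the average.
        imbalance = worst / avg - 1.0 if avg > 0 else 0.0
        report[group] = {"mean": avg, "max": worst, "imbalance": imbalance}
    return report


if __name__ == "__main__":
    # Hypothetical timings, for demonstration only (not measured data).
    demo = [
        (0, "stress", "Absorbing Boundary Conditions", 3.0),
        (1, "stress", "Absorbing Boundary Conditions", 1.0),
        (0, "velocity", "Physical Domain", 2.0),
        (1, "velocity", "Physical Domain", 2.0),
    ]
    ranked = sorted(
        imbalance_by_group(demo).items(),
        key=lambda item: item[1]["imbalance"],
        reverse=True,
    )
    for (kernel, region), stats in ranked:
        print(f"{kernel:10s} {region:32s} imbalance={stats['imbalance']:.2f}")
```

Sorting the groups by this metric on each architecture would show whether the gap between ThunderX2 and Skylake concentrates in a particular kernel or region, which is the attribution the problem statement asks for. It does not by itself explain the cause.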
References
Our microkernels investigation shows that ThunderX2 is faster than Intel, leaving an open research issue for future studies on its imbalance.
— Temporal Load Imbalance on Ondes3D Seismic Simulator for Different Multicore Architectures
(2409.11392 - Solórzano et al., 17 Sep 2024) in Section VI (Discussion and Conclusion)