Model message-size dependence of inter-node bandwidth in analytical scalability models
Develop enhanced analytical performance models for broadcast and shuffle in distributed multi-GPU SQL processing that treat the effective inter-node bandwidth parameter B_n as a function of message size rather than as a constant. Doing so should improve prediction accuracy as the number of machines increases and per-message sizes decrease.
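One possible starting point, offered only as a sketch and not as the method of the cited paper, is the classic latency-bandwidth (Hockney, or alpha-beta) model. It takes the time to send a message of m bytes to be alpha + m / B_peak, so the effective bandwidth is B_n(m) = m / (alpha + m / B_peak). This approaches B_peak for large messages and drops off once m falls below roughly alpha * B_peak. The function names, the parameter values (25 GB/s peak, 10 microseconds latency, 1 GB shuffled), and the assumption that an evenly partitioned all-to-all shuffle sends total / N^2 bytes per message are all hypothetical and chosen for illustration. In practice the parameters would be fitted to measured bandwidth curves.

```python
def effective_bandwidth(msg_bytes: float, peak_bw: float, latency_s: float) -> float:
    """Effective bandwidth (bytes/s) for one message under the alpha-beta model.

    Transfer time is modelled as latency_s + msg_bytes / peak_bw, so the
    effective bandwidth is the message size divided by that time.
    """
    if msg_bytes <= 0:
        return 0.0
    return msg_bytes / (latency_s + msg_bytes / peak_bw)


def shuffle_message_bytes(total_bytes: float, num_nodes: int) -> float:
    """Per-message size in an all-to-all shuffle.

    ASSUMPTION (not from the source): data is evenly partitioned, so each
    node holds total_bytes / N and splits it into N equal destination parts.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return total_bytes / (num_nodes * num_nodes)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only (not measured values).
    PEAK_BW = 25e9      # bytes/s
    LATENCY_S = 10e-6   # seconds
    TOTAL_BYTES = 1e9   # bytes shuffled across the cluster

    for n in (2, 8, 32, 128):
        m = shuffle_message_bytes(TOTAL_BYTES, n)
        b = effective_bandwidth(m, PEAK_BW, LATENCY_S)
        print(f"N={n:4d}  message={m:14.0f} B  B_n(m)={b / 1e9:6.2f} GB/s")
```

Substituting such a B_n(m) for the constant in the broadcast and shuffle models would let the predicted bandwidth shrink as the node count grows. Whether this two-parameter form fits the measured curves well enough, or whether a piecewise or table-based fit is needed, remains an open question.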
References
Our models assume a constant $B_n$; however, in reality, $B_n$ varies with message size (see Figure~\ref{fig:broadcast-bw-HM}). This means that our models may become less accurate with an increasing number of nodes since the message size becomes smaller. Corrections of $B_n$ to the effect of message sizes are left as future work.
— Terabyte-Scale Analytics in the Blink of an Eye
(2506.09226 - Wu et al., 10 Jun 2025) in Section 6.3 (Performance Projection)