Papers
Topics
Authors
Recent
2000 character limit reached

$k$-ported vs. $k$-lane Broadcast, Scatter, and Alltoall Algorithms

Published 27 Aug 2020 in cs.DC | (2008.12144v1)

Abstract: In $k$-ported message-passing systems, a processor can simultaneously receive $k$ different messages from $k$ other processors, and send $k$ different messages to $k$ other processors that may or may not be different from the processors from which messages are received. Modern clustered systems may not have such capabilities. Instead, compute nodes consisting of $n$ processors can simultaneously send and receive $k$ messages from other nodes, by letting $k$ processors on the nodes concurrently send and receive at most one message. We pose the question of how to design good algorithms for this $k$-lane model, possibly by adapting algorithms devised for the traditional $k$-ported model. We discuss and compare a number of (non-optimal) $k$-lane algorithms for the broadcast, scatter and alltoall collective operations (as found in, e.g., MPI), and experimentally evaluate these on a small $36\times 32$-node cluster with a dual OmniPath network (corresponding to $k=2$). Results are preliminary.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

Collections

Sign up for free to add this paper to one or more collections.