
Edge-Disjoint Spanning Trees on Star-Product Networks (2403.12231v2)

Published 18 Mar 2024 in cs.NI, cs.DC, and math.CO

Abstract: A star-product operation may be used to create large graphs from smaller factor graphs. Network topologies based on star products offer several advantages, including low diameter, high scalability, and modularity. Many state-of-the-art diameter-2 and diameter-3 topologies (Slim Fly, Bundlefly, PolarStar, etc.) can be represented as star products. In this paper, we explore constructions of edge-disjoint spanning trees (EDSTs) in star-product topologies. EDSTs expose multiple parallel disjoint pathways in the network and can be leveraged to accelerate collective communication, enhance fault tolerance and network recovery, and manage congestion. Our EDSTs have provably maximum or near-maximum cardinality, which amplifies their benefits. We further analyze their depths and show that, for one of our constructions, all trees have depth on the order of the depth of the EDSTs of the factor graphs, and for all other constructions, a large subset of the trees have that depth.
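
To make the objects in the abstract concrete, here is a minimal Python sketch. It is not the paper's construction. It assumes the usual definition of a star product: vertices are pairs (x, y); copies of the supernode graph sit at every structure-graph vertex x; and each structure-graph arc (x, x2) joins (x, y) to (x2, f(x, x2, y)) for some bijection f. The function names, the example factor graphs, and the choice of f are all hypothetical. The sketch also computes the simple counting bound floor(|E| / (|V| - 1)) on the number of EDSTs, which holds because every spanning tree uses exactly |V| - 1 edges.

```python
from itertools import product


def star_product(vs, arcs_s, vn, edges_n, f):
    """Build a star product (assumed definition, see lead-in).

    vs, arcs_s : vertices and oriented arcs (x, x2) of the structure graph
    vn, edges_n: vertices and edges (y, y2) of the supernode graph
    f          : f(x, x2, y) -> y2, a bijection on vn for each arc (x, x2)
    """
    edges = set()
    # Intra-supernode edges: one copy of the supernode graph per x.
    for x in vs:
        for (y, y2) in edges_n:
            edges.add(frozenset({(x, y), (x, y2)}))
    # Inter-supernode edges: follow each structure arc through the bijection f.
    for (x, x2) in arcs_s:
        for y in vn:
            edges.add(frozenset({(x, y), (x2, f(x, x2, y))}))
    vertices = list(product(vs, vn))
    return vertices, edges


def edst_upper_bound(num_vertices, num_edges):
    """Counting bound: each spanning tree needs num_vertices - 1 edges."""
    return num_edges // (num_vertices - 1)


# Hypothetical example: both factor graphs are triangles (K3).
vs = [0, 1, 2]
arcs_s = [(0, 1), (1, 2), (2, 0)]   # one orientation per structure edge
vn = [0, 1, 2]
edges_n = [(0, 1), (1, 2), (0, 2)]


def shift(x, x2, y):
    """Hypothetical bijection: rotate supernode labels by one."""
    return (y + 1) % 3


vertices, edges = star_product(vs, arcs_s, vn, edges_n, shift)
bound = edst_upper_bound(len(vertices), len(edges))
print(len(vertices), len(edges), bound)  # 9 18 2
```

Replacing `shift` with the identity map on y gives the Cartesian product of the two factor graphs, which is the special case of the star product under this assumed definition.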

