
DAWN: Matrix Operation-Optimized Algorithm for Shortest Paths Problem on Unweighted Graphs (2208.04514v11)

Published 9 Aug 2022 in cs.DC

Abstract: The shortest paths problem is a fundamental challenge in graph theory, with a broad range of potential applications. Algorithms based on matrix multiplication exhibit excellent parallelism and scalability, but are constrained by high memory consumption and algorithmic complexity. Traditional shortest paths algorithms, such as BFS and Dijkstra's algorithm, are limited by priority queues, making the improvement of their parallelism a focal issue. We propose a matrix operation-optimized algorithm, which offers improved parallelism, reduced time complexity, and lower memory consumption. The novel algorithm requires $O(E_{wcc}(i))$ and $O(S_{wcc} \cdot E_{wcc})$ time for the single-source and all-pairs shortest paths problems, respectively, where $S_{wcc}$ and $E_{wcc}$ denote the number of nodes and edges in the largest weakly connected component of the graph. To evaluate the effectiveness of the novel algorithm, we tested it on graphs from the SuiteSparse Matrix Collection and the Gunrock benchmark dataset. Our algorithm outperformed the BFS implementations from Gunrock and GAP (the previous state-of-the-art solution), achieving average speedups of 3.769$\times$ and 9.410$\times$, respectively.
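
To make the single-source setting concrete, the following is a minimal sketch of level-by-level shortest paths on an unweighted graph. Each round expands the current frontier along its outgoing edges, which can be read as one masked boolean vector-matrix product over the adjacency structure. It is not the DAWN algorithm from the paper; the function name `sssp_unweighted`, the adjacency-list input, and the use of `-1` for unreachable nodes are assumptions made for illustration only.

```python
from typing import List


def sssp_unweighted(adj: List[List[int]], source: int) -> List[int]:
    """Hop distances from `source` in an unweighted directed graph.

    `adj[u]` lists the out-neighbours of node u (hypothetical input format).
    Nodes that cannot be reached from `source` keep the distance -1.
    """
    n = len(adj)
    dist = [-1] * n
    dist[source] = 0
    frontier = [source]  # nodes first reached in the previous round
    step = 0
    while frontier:
        step += 1
        next_frontier = []
        # One round corresponds to a boolean product of the frontier
        # vector with the adjacency matrix, masked by visited nodes.
        for u in frontier:
            for v in adj[u]:
                if dist[v] == -1:
                    dist[v] = step
                    next_frontier.append(v)
        frontier = next_frontier
    return dist


if __name__ == "__main__":
    # Node 4 has an edge into node 0 but is not reachable from it.
    graph = [[1, 2], [3], [3], [], [0]]
    print(sssp_unweighted(graph, 0))
```

In this sketch each out-edge of a reached node is scanned exactly once, so the work from a given source grows with the number of edges reachable from it rather than with the size of the whole graph. Calling the function once per source node gives a simple all-pairs variant.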
