Design, Configuration, Implementation, and Performance of a Simple 32 Core Raspberry Pi Cluster (1708.05264v3)

Published 17 Aug 2017 in cs.DC

Abstract: In this report, I describe the design and implementation of an inexpensive, eight-node, 32-core cluster of Raspberry Pi single-board computers, as well as the performance of this cluster on two computational tasks: one that requires significant data transfer relative to its computation time, and one that does not. We have two use-cases for the cluster: (a) as an educational tool for classroom use, such as covering parallel algorithms in an algorithms course; and (b) as a test system for use during the development of parallel metaheuristics, essentially serving as a personal desktop parallel computing cluster. Our preliminary results show that the slow 100 Mbps networking of the Raspberry Pi significantly limits such clusters to parallel computational tasks that are either long running relative to their data communication requirements, or that require very little internode communication. Additionally, although the Raspberry Pi 3 has a quad-core processor, parallel speedup degrades when attempting to utilize all four cores of every cluster node for a parallel computation, likely due to resource contention with operating-system-level processes. However, distributing a task across three cores of each cluster node does enable linear (or near-linear) speedup.
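The speedup and efficiency metrics underlying these observations can be sketched as follows. This is a minimal illustration of how near-linear speedup at 24 workers (3 cores per node) versus degraded speedup at 32 workers (4 cores per node) would show up in the numbers; the timing values and function names are illustrative assumptions, not data from the paper.

```python
def speedup(t_serial, t_parallel):
    """Classic parallel speedup: S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """Parallel efficiency: E(p) = S(p) / p; 1.0 means linear speedup."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical timings for an 8-node cluster. With 3 cores per node
# (24 workers) the run scales linearly; with all 4 cores per node
# (32 workers) contention with OS-level processes erodes the gain.
t1 = 960.0                                  # serial runtime, seconds (assumed)
print(efficiency(t1, t1 / 24, 24))          # 3 cores/node: efficiency 1.0
print(efficiency(t1, 38.0, 32))             # 4 cores/node: efficiency below 1.0
```

An efficiency near 1.0 indicates the workload is either long running relative to its communication cost or requires little internode communication, matching the conditions the report identifies for the 100 Mbps network.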
