
BCL: A Cross-Platform Distributed Container Library (1810.13029v3)

Published 30 Oct 2018 in cs.DC

Abstract: One-sided communication is a useful paradigm for irregular parallel applications, but most one-sided programming environments, including MPI's one-sided interface and PGAS programming languages, lack application-level libraries to support these applications. We present the Berkeley Container Library, a set of generic, cross-platform, high-performance data structures for irregular applications, including queues, hash tables, Bloom filters and more. BCL is written in C++ using an internal DSL called the BCL Core that provides one-sided communication primitives such as remote get and remote put operations. The BCL Core has backends for MPI, OpenSHMEM, GASNet-EX, and UPC++, allowing BCL data structures to be used natively in programs written using any of these programming environments. Along with our internal DSL, we present the BCL ObjectContainer abstraction, which allows BCL data structures to transparently serialize complex data types while maintaining efficiency for primitive types. We also introduce the set of BCL data structures and evaluate their performance across a number of high-performance computing systems, demonstrating that BCL programs are competitive with hand-optimized code, even while hiding many of the underlying details of message aggregation, serialization, and synchronization.
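
The serialization idea described in the abstract (complex types are serialized transparently, while primitive types stay cheap) can be illustrated with a small, self-contained C++ sketch. This is not the BCL API: the names `Bytes`, `Serializer`, and `Container` are hypothetical and chosen only for illustration, and the sketch runs in a single process with no communication backend. It assumes that "efficiency for primitive types" can be approximated by a plain byte copy of trivially copyable values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// All names below are hypothetical; they are NOT part of BCL.
using Bytes = std::vector<std::uint8_t>;

// Default path: trivially copyable types are stored as their raw bytes,
// so packing and unpacking are a single memcpy each.
template <typename T>
struct Serializer {
  static_assert(std::is_trivially_copyable<T>::value,
                "specialize Serializer for types that are not trivially copyable");

  static Bytes pack(const T& value) {
    Bytes out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
  }

  static T unpack(const Bytes& in) {
    assert(in.size() == sizeof(T));
    T value;
    std::memcpy(&value, in.data(), sizeof(T));
    return value;
  }
};

// Specialization for a complex type: std::string is flattened to its characters.
template <>
struct Serializer<std::string> {
  static Bytes pack(const std::string& s) { return Bytes(s.begin(), s.end()); }
  static std::string unpack(const Bytes& in) {
    return std::string(in.begin(), in.end());
  }
};

// A wrapper that holds the flat byte form of a value. A distributed data
// structure could move such a buffer with remote put and get operations.
template <typename T>
class Container {
 public:
  explicit Container(const T& value) : bytes_(Serializer<T>::pack(value)) {}
  T get() const { return Serializer<T>::unpack(bytes_); }
  std::size_t size_bytes() const { return bytes_.size(); }

 private:
  Bytes bytes_;
};
```

Under these assumptions, `Container<int>` occupies exactly `sizeof(int)` bytes and round-trips through a plain copy, while `Container<std::string>` uses the specialization. Choosing the path at compile time means the caller writes the same code for both kinds of type.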
