
Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL (2403.18374v2)

Published 27 Mar 2024 in cs.DC and cs.AR

Abstract: Most FPGA boards in the HPC domain are well-suited for parallel scaling because of the direct integration of versatile and high-throughput network ports. However, the utilization of their network capabilities is often challenging and error-prone because the whole network stack and communication patterns have to be implemented and managed on the FPGAs. Also, this approach conceptually involves a trade-off between the performance potential of improved communication and the impact of resource consumption for communication infrastructure, since the utilized resources on the FPGAs could otherwise be used for computations. In this work, we investigate this trade-off, first by using synthetic benchmarks to evaluate the different configuration options of the communication framework ACCL and their impact on communication latency and throughput. We then use our findings to implement a shallow water simulation whose scalability heavily depends on low-latency communication. With a suitable configuration of ACCL, good scaling behavior can be shown up to all 48 FPGAs installed in the system. Overall, the results show that the availability of inter-FPGA communication frameworks as well as the configurability of framework and network stack are crucial to achieve the best application performance with low-latency communication.

