Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
169 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

On the random access performance of Cell Broadband Engine with graph analysis application (1105.5881v2)

Published 30 May 2011 in cs.CE and cs.PF

Abstract: The Cell Broad Engine (BE) Processor has unique memory access architecture besides its powerful computing engines. Many computing-intensive applications have been ported to Cell/BE successfully. But memory-intensive applications are rarely investigated except for several micro benchmarks. Since Cell/BE has powerful software visible DMA engine, this paper studies on whether Cell/BE is suit for applica- tions with large amount of random memory accesses. Two benchmarks, GUPS and SSCA#2, are used. The latter is a rather complex one that in representative of real world graph analysis applications. We find both benchmarks have good performance on Cell/BE based IBM QS20/22. Com- pared with 2 conventional multi-processor systems with the same core/thread number, GUPS is about 40-80% fast and SSCA#2 about 17-30% fast. The dynamic load balanc- ing and software pipeline for optimizing SSCA#2 are intro- duced. Based on the experiment, the potential of Cell/BE for random access is analyzed in detail as well as its limita- tions of memory controller, atomic engine and TLB manage- ment.Our research shows although more programming effort are needed, Cell/BE has the potencial for irregular memory access applications.

Summary

We haven't generated a summary for this paper yet.