Big Data at HPC Wales (1506.08907v1)
Abstract: This paper describes an automated approach to handling Big Data workloads on HPC systems. We describe a solution that dynamically creates a unified YARN-based cluster in an HPC environment, without the need to configure and allocate a dedicated Hadoop cluster. The end user can write a solution in any combination of the supported frameworks, and that solution scales seamlessly from a few cores to thousands of cores. This coupling of environments creates a platform on which applications can use native HPC solutions alongside Big Data frameworks. Users are provided with HPC Wales APIs in multiple languages that let them integrate this flow into their own environment, so that the traditional means of HPC access do not become a bottleneck. We describe the behavior of cluster creation and present performance results on Terasort.
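As a rough illustration of what dynamically creating a YARN cluster inside an HPC allocation might involve, the sketch below takes a list of hosts granted by the batch scheduler, designates the first as the ResourceManager and the rest as NodeManagers, and renders a minimal yarn-site.xml. This is not the HPC Wales implementation, which the abstract does not detail. The function names, host names, and resource figures are hypothetical. Only the three YARN property names are standard Hadoop settings.

```python
import xml.etree.ElementTree as ET


def plan_cluster(hosts):
    """Split allocated hosts into one ResourceManager and the NodeManagers.

    Hypothetical helper: assumes the first host in the allocation acts as master.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two hosts: one master, one worker")
    return {"resourcemanager": hosts[0], "nodemanagers": list(hosts[1:])}


def yarn_site_xml(plan, memory_mb, vcores):
    """Render a minimal yarn-site.xml for the planned cluster."""
    props = {
        # Standard YARN property names; the values are supplied by the caller.
        "yarn.resourcemanager.hostname": plan["resourcemanager"],
        "yarn.nodemanager.resource.memory-mb": str(memory_mb),
        "yarn.nodemanager.resource.cpu-vcores": str(vcores),
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical host names standing in for a scheduler allocation.
    plan = plan_cluster(["node001", "node002", "node003"])
    print(yarn_site_xml(plan, memory_mb=32768, vcores=16))
```

In a setup of this kind, the generated file would be written to a per-job configuration directory before the YARN daemons are started on the allocated hosts, so that the cluster exists only for the lifetime of the job.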