The National Research Platform: Stretched, Multi-Tenant, Scientific Kubernetes Cluster (2505.22864v1)
Abstract: The National Research Platform (NRP) represents a distributed, multi-tenant Kubernetes-based cyberinfrastructure designed to facilitate collaborative scientific computing. Spanning over 75 locations in the U.S. and internationally, the NRP uniquely integrates varied computational resources, ranging from single nodes to extensive GPU and CPU clusters, to support diverse research workloads including advanced AI and machine learning tasks. It emphasizes flexibility through user-friendly interfaces such as JupyterHub and low level control of resources through direct Kubernetes interaction. Critical operational insights are discussed, including security enhancements using Kubernetes-integrated threat detection, extensive monitoring, and comprehensive accounting systems. This paper highlights the NRP's growing importance and scalability in addressing the increasing demands for distributed scientific computational resources.