Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems (2507.16781v1)
Abstract: The unprecedented growth in AI workloads, recently dominated by LLMs and vision-LLMs (VLMs), has intensified power and cooling demands in data centers. This study benchmarks LLMs and VLMs on two HGX nodes, each with 8x NVIDIA H100 graphics processing units (GPUs), using liquid and air cooling. Leveraging GPU Burn, Weights and Biases, and IPMItool, we collect detailed thermal, power, and computation data. Results show that the liquid-cooled systems maintain GPU temperatures between 41-50 degrees Celsius, while the air-cooled counterparts fluctuate between 54-72 degrees Celsius under load. This thermal stability of liquid-cooled systems yields 17 percent higher performance (54 TFLOPs per GPU vs. 46 TFLOPs per GPU), improved performance per watt, reduced energy overhead, and greater system efficiency than the air-cooled counterparts. These findings underscore the energy and sustainability benefits of liquid cooling, offering a compelling path forward for hyperscale data centers s
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.