Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service (2304.09781v2)
Published 19 Apr 2023 in cs.DC
Abstract: This paper presents a solution to the challenge of mitigating carbon emissions from hosting large-scale ML inference services. ML inference is critical to modern technology products, but it is also a significant contributor to carbon footprint. We introduce Clover, a carbon-friendly ML inference service runtime system that balances performance, accuracy, and carbon emissions through mixed-quality models and GPU resource partitioning. Our experimental results demonstrate that Clover is effective in substantially reducing carbon emissions while maintaining high accuracy and meeting service level agreement (SLA) targets.
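To make the trade-off in the abstract concrete, here is a minimal sketch of how a carbon-aware runtime might choose between model variants and GPU partitions. It is not Clover's actual algorithm, which the abstract does not specify. The `Config` fields, the scoring rule (accuracy minus a weighted carbon penalty), the `carbon_weight` parameter, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Config:
    """Hypothetical serving option: one model variant on one GPU partition."""
    name: str
    accuracy: float    # fraction in [0, 1]
    latency_ms: float  # tail latency per request
    energy_j: float    # energy per request, in joules


def carbon_g(config: Config, intensity_g_per_kwh: float) -> float:
    """Grams of CO2 per request: joules -> kWh (divide by 3.6e6) times grid intensity."""
    return config.energy_j / 3.6e6 * intensity_g_per_kwh


def pick_config(
    configs: Sequence[Config],
    intensity_g_per_kwh: float,
    sla_ms: float,
    carbon_weight: float,
) -> Optional[Config]:
    """Return the SLA-feasible config with the best accuracy-minus-carbon score.

    The linear score is an assumption made for this sketch, not the paper's objective.
    Returns None when no config meets the latency SLA.
    """
    feasible = [c for c in configs if c.latency_ms <= sla_ms]
    if not feasible:
        return None
    return max(
        feasible,
        key=lambda c: c.accuracy - carbon_weight * carbon_g(c, intensity_g_per_kwh),
    )


# Illustrative, made-up options: a high-quality model on a full GPU
# versus a lower-quality model on a smaller GPU partition.
CONFIGS = [
    Config("large_full_gpu", accuracy=0.84, latency_ms=40.0, energy_j=30.0),
    Config("small_partition", accuracy=0.80, latency_ms=25.0, energy_j=10.0),
]

if __name__ == "__main__":
    for intensity in (100.0, 500.0):
        choice = pick_config(CONFIGS, intensity, sla_ms=50.0, carbon_weight=30.0)
        print(intensity, choice.name if choice else None)
```

Under these made-up numbers, the sketch favors the higher-accuracy option when the grid is clean and switches to the lower-energy option as carbon intensity rises, while the latency filter keeps every choice within the SLA.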