EdgeSight: Enabling Modeless and Cost-Efficient Inference at the Edge (2405.19213v2)
Abstract: Traditional ML inference is evolving toward modeless inference, which abstracts the complexity of model selection away from users, allowing the system to automatically choose the most appropriate model for each request based on accuracy and resource requirements. While prior studies have focused on modeless inference within data centers, this paper tackles the pressing need for cost-efficient modeless inference at the edge -- particularly under its unique constraints of limited device memory, volatile network conditions, and restricted power consumption. To overcome these challenges, we propose EdgeSight, a system that provides cost-efficient modeless serving for diverse DNNs at the edge. EdgeSight employs an edge-data center (edge-DC) architecture, utilizing confidence scaling to reduce the number of model options while meeting diverse accuracy requirements. Additionally, it supports lossy inference in volatile network environments. Our experimental results show that EdgeSight outperforms existing systems by up to 1.6x in P99 latency for modeless services. Furthermore, our FPGA prototype demonstrates similar performance at certain accuracy levels, with a power consumption reduction of up to 3.34x.
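To make the confidence-scaling idea concrete, the sketch below shows one common way such an edge-DC cascade can be wired up: a small edge model answers a request whenever its softmax confidence clears a threshold, and the request is escalated to a larger data-center model otherwise. The `route_request` function, the `threshold=0.9` value, and the string labels are illustrative assumptions for this sketch, not EdgeSight's actual interface.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def route_request(edge_logits, threshold=0.9):
    """Return (prediction, where_served).

    If the edge model's top-class confidence clears `threshold`,
    serve the answer locally; otherwise signal escalation to a
    larger data-center model (prediction left as None here).
    """
    probs = softmax(np.asarray(edge_logits, dtype=float))
    confidence = float(probs.max())
    if confidence >= threshold:
        return int(probs.argmax()), "edge"
    return None, "escalate-to-dc"
```

In this framing, "confidence scaling" amounts to calibrating the edge model's scores (e.g. via temperature scaling) so that a single threshold per accuracy target replaces a large menu of candidate models.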