
Toward Runtime-Throttleable Neural Networks (1905.13179v1)

Published 30 May 2019 in cs.LG and stat.ML

Abstract: As deep neural network (NN) methods have matured, there has been increasing interest in deploying NN solutions to "edge computing" platforms such as mobile phones or embedded controllers. These platforms are often resource-constrained, especially in energy storage and power, but state-of-the-art NN architectures are designed with little regard for resource use. Existing techniques for reducing the resource footprint of NN models produce static models that occupy a single point in the trade-space between performance and resource use. This paper presents an approach to creating runtime-throttleable NNs that can adaptively balance performance and resource use in response to a control signal. Throttleable networks allow intelligent resource management, for example by allocating fewer resources in "easy" conditions or when battery power is low. We describe a generic formulation of throttling via block-level gating, apply it to create throttleable versions of several standard CNN architectures, and demonstrate that our approach allows smooth performance throttling over a wide range of operating points in image classification and object detection tasks, with only a small loss in peak accuracy.
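Below is a minimal sketch of the block-level gating idea described in the abstract: a layer built from parallel sub-blocks, of which only a fraction is executed depending on a utilization control signal u in [0, 1]. The class name ThrottleableBlock, the contiguous "first k components" gating order, and the output averaging are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ThrottleableBlock(nn.Module):
    """Width-wise gated block (illustrative): parallel conv sub-blocks whose
    outputs are combined, with only ceil(u * n) sub-blocks run at a given
    utilization level u in [0, 1]."""

    def __init__(self, in_channels, out_channels, num_components=4):
        super().__init__()
        self.components = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for _ in range(num_components)
        )

    def forward(self, x, u):
        # Keep at least one component active so the output is always defined.
        n = len(self.components)
        k = max(1, int(round(u * n)))
        # Execute only the active components; skipped ones cost no compute.
        active = [self.components[i](x) for i in range(k)]
        # Average so output magnitude stays comparable across throttle levels
        # (a normalization choice assumed here, not taken from the paper).
        return torch.stack(active, dim=0).mean(dim=0)

# Usage: the same block evaluated at low and full utilization.
block = ThrottleableBlock(16, 32, num_components=4)
x = torch.randn(2, 16, 28, 28)
y_low = block(x, u=0.25)   # cheap: 1 of 4 components
y_high = block(x, u=1.0)   # full capacity: all 4 components
```

In this style of design, the control signal u can be set at runtime (for example, lowered when battery power is scarce) without retraining or reloading the model, which is the adaptive trade-off the paper targets.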

Citations (2)
