
Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems (1810.07751v2)

Published 28 Sep 2018 in cs.DC

Abstract: Energy-harvesting technology provides a promising platform for future IoT applications. However, since communication is very expensive in these devices, applications will require inference "beyond the edge" to avoid wasting precious energy on pointless communication. We show that application performance is highly sensitive to inference accuracy. Unfortunately, accurate inference requires large amounts of computation and memory, and energy-harvesting systems are severely resource-constrained. Moreover, energy-harvesting systems operate intermittently, suffering frequent power failures that corrupt results and impede forward progress. This paper overcomes these challenges to present the first full-scale demonstration of DNN inference on an energy-harvesting system. We design and implement SONIC, an intermittence-aware software system with specialized support for DNN inference. SONIC introduces loop continuation, a new technique that dramatically reduces the cost of guaranteeing correct intermittent execution for loop-heavy code like DNN inference. To build a complete system, we further present GENESIS, a tool that automatically compresses networks to optimally balance inference accuracy and energy, and TAILS, which exploits SIMD hardware available in some microcontrollers to improve energy efficiency. Both SONIC & TAILS guarantee correct intermittent execution without any hand-tuning or performance loss across different power systems. Across three neural networks on a commercially available microcontroller, SONIC & TAILS reduce inference energy by 6.9x and 12.2x, respectively, over the state-of-the-art.

Citations (190)

Summary

  • The paper introduces a system that enables continuous DNN inference across power interruptions using loop continuation to eliminate redundant computation.
  • The research applies network compression techniques to balance inference accuracy and energy consumption, achieving energy savings up to 12.2× over prior methods.
  • The study shows that integrating hardware accelerators with optimized software can empower resource-constrained IoT devices with local intelligence.

Efficient Deep Neural Network Inference on Intermittent Energy-Harvesting Systems

The paper "Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems" presents a pioneering exploration into enabling deep neural network (DNN) inference on resource-constrained, energy-harvesting devices. The research delves deeply into the challenges these systems face, including their need to operate intermittently due to limited and discontinuous energy supplies, and their restricted computational and memory capacities. The work described in this paper provides a comprehensive solution that enhances both the feasibility and efficiency of running DNNs on such platforms.

The authors begin by establishing the necessity for inference on energy-harvesting systems, particularly for Internet of Things (IoT) applications where communication with a cloud infrastructure can be prohibitively expensive in terms of energy. The paper outlines an energy model, demonstrating that performance is closely tied to the accuracy of local inference, and proposes the use of DNNs despite their computational demands, due to their superior accuracy over traditional methods.

The central contribution of the paper is the software system, SONIC, which is designed to enable efficient intermittent execution of DNNs. SONIC achieves this by introducing specialized techniques such as loop continuation, which allows computation to continue across intermittent power cycles without the overhead typical of prior systems. Specifically, loop continuation avoids the redundant work and energy waste of restarting computations by ensuring that execution picks up exactly where it left off after a power failure.
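The idea of loop continuation can be sketched as follows. This is a hedged, simplified stand-in, not SONIC's actual implementation: nonvolatile FRAM is modeled with a static struct that survives a simulated power failure, and a dot-product loop commits its index and partial sum there each iteration so a restart resumes mid-loop rather than from scratch. (A real system must also make these two writes consistent under failure, a detail elided here.)

```c
#include <assert.h>

#define N 8

/* Stand-in for FRAM-resident state that survives power failures.
 * In this sketch, "surviving" just means it is a static that outlives
 * each call; on real hardware it would live in nonvolatile memory. */
static struct {
    int  i;    /* persisted loop index */
    long acc;  /* persisted partial dot product */
} fram = {0, 0};

/* Runs at most `budget` iterations before a simulated power failure
 * (returns 0); returns 1 once the full dot product is complete.
 * Each iteration commits its progress before advancing, so no work
 * is ever repeated across failures. */
int dot_step(const int *a, const int *b, int budget) {
    while (fram.i < N) {
        if (budget-- == 0) return 0;        /* power fails mid-loop */
        fram.acc += (long)a[fram.i] * b[fram.i];
        fram.i++;                            /* commit progress to "FRAM" */
    }
    return 1;
}
```

Calling `dot_step` repeatedly with small budgets, as an intermittently powered device would, completes the computation without redoing any iteration, which is the redundancy that loop continuation eliminates.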

To further optimize performance, the authors introduce GENESIS, a tool that applies known network compression techniques such as separation and pruning to reduce resource requirements and maximize inference efficiency. GENESIS automatically optimizes the trade-off between inference accuracy and energy consumption, selecting the configuration that maximizes application performance in terms of interesting messages per Joule (IMpJ). This balance is crucial in energy-harvesting systems, where computational efficiency is paramount.
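The trade-off GENESIS navigates can be sketched with a simple selection loop. The struct, IMpJ proxy formula, and all numbers below are illustrative assumptions, not the tool's real search or metric: each candidate is a compressed network with a measured accuracy and per-inference energy, and the best candidate is the one delivering the most correct detections per unit energy.

```c
#include <assert.h>

/* Hypothetical compressed-network candidate (illustrative fields). */
typedef struct {
    double accuracy;    /* fraction of interesting events detected */
    double e_infer_mj;  /* energy per inference, millijoules */
} Config;

/* Simplified IMpJ proxy: correct detections delivered per mJ spent,
 * charging e_comm_mj for each message actually sent. */
double impj(Config c, double p_event, double e_comm_mj) {
    double e_total = c.e_infer_mj + p_event * c.accuracy * e_comm_mj;
    return (p_event * c.accuracy) / e_total;
}

/* Return the index of the candidate maximizing the IMpJ proxy. */
int best_config(const Config *cands, int n,
                double p_event, double e_comm_mj) {
    int best = 0;
    for (int k = 1; k < n; k++)
        if (impj(cands[k], p_event, e_comm_mj) >
            impj(cands[best], p_event, e_comm_mj))
            best = k;
    return best;
}
```

With made-up candidates — a large accurate network, a moderately compressed one, and an aggressively pruned one — the middle configuration wins: the large network wastes energy per inference, while the tiny one misses too many events to justify its savings. That is the balance the paper's optimization targets.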

The system also benefits from a hardware integration module, TAILS, which exploits available hardware features, such as the Texas Instruments Low Energy Accelerator (LEA), to perform DNN computation efficiently with direct memory access. TAILS adaptively tunes its use of the hardware to maximize throughput under the device's energy constraints.
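The kind of operation the LEA accelerates is a 16-bit fixed-point multiply-accumulate over vectors. The sketch below is a plain-C stand-in for that computation, not the LEA's DMA-driven API: values are Q15 fixed point, where an `int16_t` value `x` represents `x / 32768.0`.

```c
#include <assert.h>
#include <stdint.h>

/* Q15 dot product: the core MAC loop a vector accelerator performs
 * in hardware. Q15 * Q15 products are Q30; a 32-bit accumulator
 * collects them and the result is renormalized back to Q15. */
int32_t q15_dot(const int16_t *a, const int16_t *b, int n) {
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];  /* widen before multiplying */
    return acc >> 15;                 /* Q30 -> Q15 */
}
```

Doing this loop in dedicated SIMD hardware with DMA, rather than instruction by instruction on the CPU, is where the additional energy savings of the hardware-accelerated configuration come from.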

Quantitatively, SONIC and TAILS show substantial improvements over previous state-of-the-art intermittent systems such as Alpaca, reducing inference energy by factors of 6.9× and 12.2×, respectively, while completing inference tasks across various neural network models and energy-storage configurations.

This paper has several implications for the future of AI and IoT applications. Practically, it enables more sophisticated processing on batteryless devices, opening opportunities for a wide range of IoT applications that require local intelligence. The theoretical contribution lies in proving that correct, efficient intermittent computing is possible through careful design of software systems and resource-aware network compression. Future work might explore the integration of more specialized hardware accelerators to further enhance energy efficiency, or adapt the approach for even more constrained environments or complex models.

In conclusion, this work demonstrates the feasibility of using DNNs in energy-harvesting environments and charts a path toward more efficient intermittent systems. As IoT continues to proliferate, the methodologies and insights from this paper will be indispensable for deploying AI at the edge where power resources are limited.
