NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision

Published 23 Oct 2018 in cs.CV (arXiv:1810.10090v1)

Abstract: Mobile vision systems such as smartphones, drones, and augmented-reality headsets are revolutionizing our lives. These systems usually run multiple applications concurrently and their available resources at runtime are dynamic due to events such as starting new applications, closing existing applications, and application priority changes. In this paper, we present NestDNN, a framework that takes the dynamics of runtime resources into account to enable resource-aware multi-tenant on-device deep learning for mobile vision systems. NestDNN enables each deep learning model to offer flexible resource-accuracy trade-offs. At runtime, it dynamically selects the optimal resource-accuracy trade-off for each deep learning model to fit the model's resource demand to the system's available runtime resources. In doing so, NestDNN efficiently utilizes the limited resources in mobile vision systems to jointly maximize the performance of all the concurrently running applications. Our experiments show that compared to the resource-agnostic status quo approach, NestDNN achieves as much as 4.2% increase in inference accuracy, 2.0x increase in video frame processing rate and 1.7x reduction on energy consumption.

Citations (253)

Summary

  • The paper introduces NestDNN, a framework enabling resource-aware multi-tenant on-device deep learning for continuous mobile vision by dynamically adapting models to available resources.
  • NestDNN uses an offline stage of filter pruning followed by a novel freeze-&-grow recovery to create multi-capacity models that nest multiple resource-accuracy trade-offs within a single structure.
  • Experimental results show NestDNN achieves up to a 4.2% accuracy increase, doubles video frame processing rate, and reduces energy consumption by 1.7 times compared to resource-agnostic solutions, supporting deployment across diverse mobile devices.

This paper presents NestDNN, a framework for resource-aware multi-tenant on-device deep learning tailored to mobile vision systems. As mobile devices increasingly run multiple applications concurrently, efficient, continuous on-device execution of deep learning tasks becomes paramount. The paper addresses the dynamic resource conditions these systems routinely encounter, optimizing resource allocation while maintaining high inference accuracy.

Key Aspects of NestDNN

The NestDNN framework dynamically allocates resources in response to the runtime demands of mobile vision systems. It advances beyond static frameworks by giving each on-device deep learning model a range of resource-accuracy trade-offs; at runtime, it selects the optimal trade-off for each model given the system's currently available resources.
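To make the runtime selection concrete, the sketch below frames it as a multi-choice knapsack: one capacity variant is chosen per concurrently running model to maximize total accuracy under a shared memory budget. The app names, profile numbers, and exhaustive search are illustrative assumptions, not the paper's scheduler, which optimizes its own cost function over each variant's profiled accuracy and latency.

```python
from itertools import product

# Hypothetical per-variant profiles: (accuracy, memory cost in MB).
# Numbers and app names are illustrative, not taken from the paper.
variants = {
    "face_recognition": [(0.71, 30), (0.76, 55), (0.80, 90)],
    "scene_understanding": [(0.62, 25), (0.68, 50), (0.73, 85)],
}

def select_variants(variants, budget_mb):
    """Pick one capacity variant per model to maximize summed accuracy
    under a shared memory budget (a multi-choice knapsack). Exhaustive
    search is fine for a handful of apps, each with a few variants."""
    apps = list(variants)
    best_acc, best_choice = -1.0, None
    for choice in product(*(variants[a] for a in apps)):
        cost = sum(mb for _, mb in choice)
        acc = sum(a for a, _ in choice)
        if cost <= budget_mb and acc > best_acc:
            best_acc, best_choice = acc, dict(zip(apps, choice))
    return best_choice

# With 120 MB available, capacity is traded between the two models.
print(select_variants(variants, budget_mb=120))
```

When available resources shrink (for example, a new application launches), re-running the selection with a smaller budget downgrades some models to cheaper variants instead of evicting them.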

Fundamentally, NestDNN comprises two stages: offline and online. In the offline stage, each model undergoes filter pruning based on the importance of its filters, followed by a recovery process using a novel freeze-&-grow method. This dual-phase procedure yields a multi-capacity model that nests multiple resource-accuracy trade-offs within a single model, keeping the memory footprint compact and resource usage efficient.
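The nesting idea can be illustrated with a toy layer in which each capacity level uses a prefix of one shared filter bank, so every smaller descendant model is physically contained in the larger ones. This is a minimal sketch assuming the filters were already sorted by importance offline (the paper ranks filters with its Triplet Response Residual criterion); it is not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiCapacityConv(nn.Module):
    """Toy multi-capacity layer: each level uses a prefix of one shared
    filter bank, so a smaller descendant model is nested inside a larger
    one and switching levels allocates no new weights."""
    def __init__(self, in_channels, capacities=(8, 16, 32)):
        super().__init__()
        self.capacities = sorted(capacities)
        # One weight tensor holds the largest model; filters are assumed
        # to be sorted by importance during the offline prune/grow stage.
        self.conv = nn.Conv2d(in_channels, self.capacities[-1],
                              kernel_size=3, padding=1)

    def forward(self, x, level=0):
        k = self.capacities[level]  # number of filters "paged in"
        return F.conv2d(x, self.conv.weight[:k], self.conv.bias[:k],
                        padding=1)

x = torch.randn(1, 3, 32, 32)
layer = MultiCapacityConv(3)
for lvl in range(len(layer.capacities)):
    print(lvl, layer(x, level=lvl).shape)  # channels grow with capacity
```

Because the variants share weights, upgrading or downgrading a model at runtime amounts to paging filters in or out rather than loading a separate network.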

Numerical Results and Implementation

The reported experiments show that NestDNN delivers notable improvements over resource-agnostic baselines: up to a 4.2% increase in inference accuracy, a 2.0x increase in video frame processing rate, and a 1.7x reduction in energy consumption. These gains underscore the efficacy of the flexible models, whose pruning and recovery steps generate a spectrum of well-calibrated resource-accuracy operating points.

These results are demonstrated across six diverse mobile vision applications built on datasets such as CIFAR-10 and ImageNet. Deploying NestDNN on a variety of smartphones, including the Samsung Galaxy S8, indicates the framework's robustness across hardware with differing resource capabilities.

Implications and Future Directions

The implications of these findings are significant for the future of on-device deep learning. Practically, NestDNN makes it more viable to run multiple demanding computer vision applications locally on mobile devices, without constant cloud interaction. This capability could translate into faster real-time decision making in autonomous systems and more responsive augmented-reality experiences.

In theoretical terms, this framework lays a foundation for further exploration into dynamic neural networks, paving the way for more versatile AI models that can thrive within the constraints of mobile system environments. As mobile computing continues to evolve, the framework's ability to balance resource constraints with performance will undoubtedly be pivotal in the proliferation of real-time AI-enhanced mobile applications.

Future work should consider extending NestDNN to deep learning architectures beyond VGG and ResNet. Integrating more sophisticated runtime scheduling algorithms could also yield additional performance gains.

In conclusion, NestDNN establishes a pivotal approach to multi-tenant on-device deep learning, addressing critical scalability demands of mobile systems and representing a promising direction for achieving continuous mobile vision in a dynamically evolving technological landscape.
