Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

OODIn: An Optimised On-Device Inference Framework for Heterogeneous Mobile Devices (2106.04723v1)

Published 8 Jun 2021 in cs.LG and cs.CV

Abstract: Radical progress in the field of deep learning (DL) has led to unprecedented accuracy in diverse inference tasks. As such, deploying DL models across mobile platforms is vital to enable the development and broad availability of the next-generation intelligent apps. Nevertheless, the wide and optimised deployment of DL models is currently hindered by the vast system heterogeneity of mobile devices, the varying computational cost of different DL models and the variability of performance needs across DL applications. This paper proposes OODIn, a framework for the optimised deployment of DL apps across heterogeneous mobile devices. OODIn comprises a novel DL-specific software architecture together with an analytical framework for modelling DL applications that: (1) counteract the variability in device resources and DL models by means of a highly parametrised multi-layer design; and (2) perform a principled optimisation of both model- and system-level parameters through a multi-objective formulation, designed for DL inference apps, in order to adapt the deployment to the user-specified performance requirements and device capabilities. Quantitative evaluation shows that the proposed framework consistently outperforms status-quo designs across heterogeneous devices and delivers up to 4.3x and 3.5x performance gain over highly optimised platform- and model-aware designs respectively, while effectively adapting execution to dynamic changes in resource availability.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Stylianos I. Venieris (42 papers)
  2. Ioannis Panopoulos (5 papers)
  3. Iakovos S. Venieris (13 papers)
Citations (13)

Summary

We haven't generated a summary for this paper yet.