
EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices (1905.07346v1)

Published 17 May 2019 in cs.LG, cs.PF, and stat.ML

Abstract: In recent years, advances in deep learning have resulted in unprecedented leaps in diverse tasks spanning from speech and object recognition to context awareness and health monitoring. As a result, an increasing number of AI-enabled applications are being developed targeting ubiquitous and mobile devices. While deep neural networks (DNNs) are getting bigger and more complex, they also impose a heavy computational and energy burden on the host devices, which has led to the integration of various specialized processors in commodity devices. Given the broad range of competing DNN architectures and the heterogeneity of the target hardware, there is an emerging need to understand the compatibility between DNN-platform pairs and the expected performance benefits on each platform. This work attempts to demystify this landscape by systematically evaluating a collection of state-of-the-art DNNs on a wide variety of commodity devices. In this respect, we identify potential bottlenecks in each architecture and provide important guidelines that can assist the community in the co-design of more efficient DNNs and accelerators.

Authors
  1. Mario Almeida
  2. Stefanos Laskaridis
  3. Ilias Leontiadis
  4. Stylianos I. Venieris
  5. Nicholas D. Lane
Citations (69)