
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases (2406.10290v1)

Published 12 Jun 2024 in cs.CL, cs.AI, and cs.LG

Abstract: The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization. However, the hardware constraints of mobile devices necessitate the use of models with fewer parameters and model compression techniques like quantization. Currently, there is limited understanding of quantization's impact on performance across tasks, including LLM tasks, LMM tasks, and, critically, trust and safety. There is also a lack of adequate tools for systematically testing these models on mobile devices. To address these gaps, we introduce MobileAIBench, a comprehensive benchmarking framework for evaluating mobile-optimized LLMs and LMMs. MobileAIBench assesses models across different sizes, quantization levels, and tasks, measuring latency and resource consumption on real devices. Our two-part open-source framework includes a library for running evaluations on desktops and an iOS app for on-device latency and hardware utilization measurements. Our thorough analysis aims to accelerate mobile AI research and deployment by providing insights into the performance and feasibility of deploying LLMs and LMMs on mobile platforms.

Authors (18)
  1. Rithesh Murthy (12 papers)
  2. Liangwei Yang (46 papers)
  3. Juntao Tan (33 papers)
  4. Tulika Manoj Awalgaonkar (3 papers)
  5. Yilun Zhou (28 papers)
  6. Shelby Heinecke (37 papers)
  7. Sachin Desai (1 paper)
  8. Jason Wu (28 papers)
  9. Ran Xu (89 papers)
  10. Sarah Tan (21 papers)
  11. Jianguo Zhang (97 papers)
  12. Zhiwei Liu (114 papers)
  13. Shirley Kokane (9 papers)
  14. Zuxin Liu (43 papers)
  15. Ming Zhu (117 papers)
  16. Huan Wang (211 papers)
  17. Caiming Xiong (337 papers)
  18. Silvio Savarese (200 papers)
Citations (4)