Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study (2407.09295v2)

Published 12 Jul 2024 in cs.CR

Abstract: The rapid progress in the reasoning capability of multimodal LLMs (MLLMs) has triggered the development of autonomous agent systems on mobile devices. MLLM-based mobile agent systems consist of perception, reasoning, memory, and multi-agent collaboration modules, enabling automatic analysis of user instructions and the design of task pipelines with only natural language and device screenshots as inputs. Despite the increased efficiency of human-machine interaction, the security risks of MLLM-based mobile agent systems have not been systematically studied. Existing security benchmarks for agents mainly focus on Web scenarios, and the attack techniques against MLLMs are also limited in the mobile agent scenario. To close these gaps, this paper proposes a mobile agent security matrix covering 3 functional modules of the agent systems. Based on the security matrix, this paper proposes 4 realistic attack paths and verifies these attack paths through 8 attack methods. By analyzing the attack results, this paper reveals that MLLM-based mobile agent systems are not only vulnerable to multiple traditional attacks, but also raise new security concerns that were previously unconsidered. This paper highlights the need for security awareness in the design of MLLM-based systems and paves the way for future research on attack and defense methods.

Authors (7)
  1. Yulong Yang (19 papers)
  2. Xinshan Yang (1 paper)
  3. Shuaidong Li (1 paper)
  4. Chenhao Lin (36 papers)
  5. Zhengyu Zhao (43 papers)
  6. Chao Shen (168 papers)
  7. Tianwei Zhang (199 papers)