Attacker's Noise Can Manipulate Your Audio-based LLM in the Real World (2507.06256v1)

Published 7 Jul 2025 in cs.CR, cs.AI, cs.SD, and eess.AS

Abstract: This paper investigates the real-world vulnerabilities of audio-based LLMs (ALLMs), such as Qwen2-Audio. We first demonstrate that an adversary can craft stealthy audio perturbations to manipulate ALLMs into exhibiting specific targeted behaviors, such as eliciting responses to wake-keywords (e.g., "Hey Qwen"), or triggering harmful behaviors (e.g., "Change my calendar event"). Subsequently, we show that playing adversarial background noise during user interaction with the ALLMs can significantly degrade the response quality. Crucially, our research illustrates the scalability of these attacks to real-world scenarios, impacting other innocent users when these adversarial noises are played through the air. Further, we discuss the transferability of the attack and potential defensive measures.

Tweets

https://twitter.com/chaumian/status/1943132067585233146

https://twitter.com/imVinusankars/status/1943444822020493824
