DolphinAttack: Inaudible Voice Commands
The paper "DolphinAttack: Inaudible Voice Commands" presents a novel method for attacking speech recognition (SR) systems such as Siri, Google Now, and Alexa with voice commands modulated onto ultrasonic carriers. The attack exploits the nonlinearity of microphone circuits: the hardware itself demodulates the commands back into the audible band, where the SR system interprets them, while human listeners hear nothing.
Research Overview
Recent advancements in speech recognition technology have made voice-controllable systems (VCS) an integral part of various applications. Despite improvements in accuracy and usability, the security of these systems against malicious attacks remains a complex challenge. Previous research identified hidden voice commands that are incomprehensible to humans but effective against SR systems. However, these commands, although obfuscated, are still audible.
DolphinAttack advances this line of work by rendering voice commands completely inaudible to the human ear. This is achieved by amplitude-modulating the commands onto ultrasonic carriers above 20 kHz. Nonlinearity in the microphone circuitry then demodulates the signal, allowing the SR system to recover and interpret it. The paper validates the attack across major SR systems and a wide range of devices, including smartphones, laptops, smart home devices, and vehicles.
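The modulation step described above can be illustrated with a short sketch. This is not the paper's attack tooling, just classic amplitude modulation in numpy; the 25 kHz carrier, 0.8 modulation depth, and the 400 Hz tone standing in for a voice command are illustrative assumptions:

```python
import numpy as np

FS = 192_000        # sample rate high enough to represent a 25 kHz carrier (assumption)
F_CARRIER = 25_000  # ultrasonic carrier, above the ~20 kHz limit of human hearing
DURATION = 0.5      # seconds

t = np.arange(int(FS * DURATION)) / FS

# Stand-in for a voice command: a 400 Hz tone (a real attack would use TTS audio).
baseband = np.sin(2 * np.pi * 400 * t)

# Classic amplitude modulation: carrier scaled by (1 + m * baseband).
m = 0.8  # modulation depth, one of the parameters the paper analyzes
modulated = (1 + m * baseband) * np.cos(2 * np.pi * F_CARRIER * t)

# After modulation, essentially all spectral energy sits around 25 kHz,
# i.e. outside the audible band.
spectrum = np.abs(np.fft.rfft(modulated))
freqs = np.fft.rfftfreq(len(modulated), 1 / FS)
audible_energy = spectrum[freqs < 20_000].sum()
ultrasonic_energy = spectrum[freqs >= 20_000].sum()
```

The key property is visible in the final comparison: the command's content rides entirely in sidebands around the carrier, so a human hears nothing even though the full command is present in the waveform.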
Experimental Validation
The paper details two implementations of DolphinAttack: a benchtop setup for controlled testing and a more portable rig for practical scenarios. Parameters critical to the attack's success, such as modulation depth and carrier frequency, are analyzed systematically across different devices and systems. The researchers confirm the attack on a wide range of platforms, underlining its feasibility and potential impact.
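Why the microphone recovers the command can be shown with a toy simulation. A circuit with a quadratic nonlinearity effectively squares its input, and squaring an AM waveform produces a baseband copy of the command that survives the device's audio-band low-pass filtering while the carrier does not. The quadratic coefficient and filter cutoff below are illustrative assumptions, not measured device characteristics:

```python
import numpy as np

FS = 192_000
t = np.arange(int(FS * 0.5)) / FS

# An ultrasonically modulated "command" (a 400 Hz tone as a stand-in for speech).
baseband = np.sin(2 * np.pi * 400 * t)
modulated = (1 + 0.8 * baseband) * np.cos(2 * np.pi * 25_000 * t)

# Toy microphone model: linear response plus a small quadratic term (assumed coefficient).
mic_out = modulated + 0.1 * modulated**2

# The squared term expands to (1 + m*b)^2 * cos^2(w*t), which contains the
# command b near DC. Low-pass filtering at 8 kHz (FFT masking, standing in for
# the device's audio chain) removes the carrier but keeps that baseband copy.
spec = np.fft.rfft(mic_out)
freqs = np.fft.rfftfreq(len(mic_out), 1 / FS)
spec[freqs > 8_000] = 0
recovered = np.fft.irfft(spec, n=len(mic_out))
recovered -= recovered.mean()  # drop the DC offset introduced by squaring

# Correlate with the original command to confirm recovery.
corr = np.corrcoef(baseband, recovered)[0, 1]
```

Under this model, the recovered signal correlates strongly with the original command, which is exactly the property the attack exploits: demodulation happens in hardware, before any software sees the audio.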
The experiments reveal a range of practical implications:
- Activation of SR systems without user consent.
- Execution of unauthorized operations such as initiating calls or changing device settings.
- Use of common text-to-speech (TTS) systems for generating activation and control commands, highlighting the generic nature of the attack.
Implications and Defense Strategies
The findings raise significant concerns about inherent vulnerabilities in current VCS designs, which assume that adversaries can inject commands only within the audible range. The potential for unnoticed security breaches underscores the need to re-evaluate the defense mechanisms in current SR systems.
The paper proposes hardware-based defenses, such as enhancing microphone designs to suppress ultrasonic frequencies and software-based approaches like machine learning classifiers to detect unusual signal patterns indicative of DolphinAttack. These strategies aim to mitigate the identified vulnerabilities and enhance the resilience of VCS against inaudible command injections.
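One simple software-side check in the spirit of these defenses is to look for residual near-ultrasonic energy in a capture, since an imperfectly filtered attack signal can leave traces of the carrier. This is a minimal illustrative feature, not the paper's trained classifier; the cutoff frequency, threshold, and synthetic signals are all assumptions:

```python
import numpy as np

def ultrasonic_energy_ratio(signal: np.ndarray, fs: int, cutoff: float = 18_000.0) -> float:
    """Fraction of spectral energy above `cutoff` Hz (illustrative detector feature)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    total = spectrum.sum()
    return float(spectrum[freqs >= cutoff].sum() / total) if total > 0 else 0.0

FS = 96_000
t = np.arange(FS) / FS  # one second of audio

# A "normal" command: band-limited speech stand-in (sum of low-frequency tones).
normal = sum(np.sin(2 * np.pi * f * t) for f in (300, 800, 2_000))

# An attack-like capture: recovered baseband plus a residual modulated carrier
# leaking past the microphone's anti-aliasing filter.
attack = 0.2 * normal + (1 + 0.5 * np.sin(2 * np.pi * 400 * t)) * np.cos(2 * np.pi * 24_000 * t)

THRESHOLD = 0.1  # decision threshold is an assumption, not a tuned value
suspicious = ultrasonic_energy_ratio(attack, FS) > THRESHOLD
clean = ultrasonic_energy_ratio(normal, FS) > THRESHOLD
```

A production defense would need real training data and a proper classifier, as the paper suggests, but the underlying intuition is the same: legitimate speech carries almost no energy near the ultrasonic band, while attack residue does.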
Future Developments
The practical implications of DolphinAttack extend to varied domains, suggesting several avenues for future research:
- Development of more robust detection algorithms to identify anomalies in captured audio signals.
- Exploration of adaptive defense mechanisms that can dynamically respond to evolving attack strategies.
- Examination of the broader applicability of ultrasonic modulation techniques in securing SR systems.
Conclusion
The paper rigorously demonstrates the feasibility of injecting inaudible voice commands into SR systems and elucidates significant security shortcomings in widely used devices. DolphinAttack serves as a critical reminder of the evolving landscape of security threats to voice-controllable systems, urging continued innovation in defense methodologies to protect against emerging attack vectors.