Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models (2405.06134v2)
Abstract: Recent developments in large speech foundation models like Whisper have led to their widespread use in many automatic speech recognition (ASR) applications. These systems incorporate 'special tokens' in their vocabulary, such as $\texttt{<|endoftext|>}$, to guide their language generation process. However, we demonstrate that these tokens can be exploited by adversarial attacks to manipulate the model's behavior. We propose a simple yet effective method to learn a universal acoustic realization of Whisper's $\texttt{<|endoftext|>}$ token, which, when prepended to any speech signal, encourages the model to ignore the speech and only transcribe the special token, effectively 'muting' the model. Our experiments demonstrate that the same universal 0.64-second adversarial audio segment can successfully mute a target Whisper ASR model for over 97% of speech samples. Moreover, we find that this universal adversarial audio segment often transfers to new datasets and tasks. Overall, this work demonstrates the vulnerability of Whisper models to 'muting' adversarial attacks. Such attacks pose both risks and potential benefits in real-world settings: for example, the attack can be used to bypass speech moderation systems, or conversely it can be used to protect private speech data.
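To make the mechanics concrete, here is a minimal sketch of how a learned segment would be applied and scored. It is an illustration under stated assumptions, not the authors' implementation: the helper names, the amplitude bound `epsilon`, and the rule that an empty transcript counts as "muted" are all hypothetical. The only values taken from the abstract are the 0.64-second segment length and the idea of prepending the segment to the speech; the 16 kHz rate is the sample rate Whisper expects as input.

```python
# Illustrative sketch only: helper names, epsilon, and the "empty transcript
# means muted" rule are assumptions, not details from the paper.

SAMPLE_RATE_HZ = 16_000   # Whisper expects 16 kHz audio
SEGMENT_SECONDS = 0.64    # segment duration reported in the abstract


def segment_length(sample_rate_hz=SAMPLE_RATE_HZ, seconds=SEGMENT_SECONDS):
    """Number of audio samples in the adversarial segment."""
    return round(sample_rate_hz * seconds)


def clip_segment(segment, epsilon):
    """Clamp every sample to [-epsilon, epsilon] (hypothetical amplitude bound)."""
    return [max(-epsilon, min(epsilon, s)) for s in segment]


def prepend_segment(segment, speech):
    """Place the same universal segment in front of a speech waveform."""
    return list(segment) + list(speech)


def mute_success_rate(transcripts):
    """Fraction of transcripts that are empty, i.e. the model was 'muted'."""
    if not transcripts:
        return 0.0
    muted = sum(1 for t in transcripts if t.strip() == "")
    return muted / len(transcripts)


if __name__ == "__main__":
    segment = clip_segment([0.5] * segment_length(), epsilon=0.02)
    one_second_of_speech = [0.1] * SAMPLE_RATE_HZ
    attacked = prepend_segment(segment, one_second_of_speech)
    print(len(segment), len(attacked))
    print(mute_success_rate(["", "hello world", "", " "]))
```

In a real experiment the segment values would come from an optimization over training utterances, and the transcripts would come from running the target ASR model on each attacked waveform; both steps are outside the scope of this sketch.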
Authors:
- Vyas Raina
- Rao Ma
- Charles McGhee
- Kate Knill
- Mark Gales