The paper "Characterizing the adversarial vulnerability of speech self-supervised learning" explores the resilience of self-supervised learning (SSL) models for speech processing against adversarial attacks.
SSL has gained popularity in the speech community because a single pretrained model can improve performance across many downstream tasks with minimal adaptation, a paradigm standardized by the Speech processing Universal PERformance Benchmark (SUPERB). SUPERB evaluates SSL models by reusing one upstream model across tasks, each with a small downstream dataset and minimal architectural modification.
The paper focuses on understanding how these SSL models withstand adversarial attacks, which are crafted to deceive the models into making incorrect predictions. Two types of adversaries are considered:
- Zero-Knowledge Adversaries: These attackers have no specific information about the SSL model they are targeting. Despite this lack of knowledge, the attacks show a degree of transferability, implying that adversarial examples crafted for one model can potentially affect other models.
- Limited-Knowledge Adversaries: These attackers possess partial information, such as the target model's architecture or its training data. The findings reveal that SSL models in the SUPERB paradigm are particularly vulnerable to limited-knowledge attacks, indicating that even partial knowledge of the model significantly increases an attack's effectiveness.
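To make the attack mechanics concrete, adversarial examples of the kind discussed above are typically generated by perturbing the input in the direction that increases the model's loss, subject to an imperceptibility bound. The sketch below shows the one-step fast gradient sign method (FGSM) on a toy logistic model over a raw waveform; the real attacks in the paper target SSL speech models via automatic differentiation, so the tiny linear model, weights, and perturbation budget here are purely illustrative assumptions:

```python
import numpy as np

def fgsm_attack(x, w, b, y_true, eps):
    """One-step FGSM on a toy logistic model (illustrative stand-in
    for a speech model).

    x: input waveform (1-D array); w, b: model parameters;
    y_true: label in {0, 1}; eps: L_inf perturbation budget.
    """
    # Forward pass: p = sigmoid(w.x + b)
    z = float(w @ x + b)
    p = 1.0 / (1.0 + np.exp(-z))
    # Gradient of the binary cross-entropy loss w.r.t. the input
    # works out to (p - y_true) * w for this model.
    grad_x = (p - y_true) * w
    # FGSM step: nudge every sample in the sign of the gradient,
    # which maximally increases the loss under the L_inf bound.
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
x = rng.standard_normal(16000)           # 1 s of synthetic 16 kHz audio
w = rng.standard_normal(16000) * 0.01    # toy model weights
x_adv = fgsm_attack(x, w, b=0.0, y_true=1, eps=0.002)
```

Stronger attacks (e.g., projected gradient descent) simply iterate this step, projecting back into the epsilon-ball after each update; the transferability findings above concern whether `x_adv` crafted against one model also fools another.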
The paper also includes an XAB test, a perceptual listening test in which humans try to distinguish adversarial examples from the original audio. The results indicate that the attacks are not only effective but also largely imperceptible to human listeners.
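The XAB test itself requires human listeners, but a common automatic proxy for how audible a perturbation is the signal-to-noise ratio (SNR) of the adversarial noise relative to the clean signal. A minimal sketch (the 16 kHz length and the noise scale are illustrative assumptions, not values from the paper):

```python
import numpy as np

def perturbation_snr_db(x, x_adv):
    """SNR of the adversarial perturbation in dB: higher values mean
    a quieter perturbation that is harder for a listener to notice."""
    noise = x_adv - x
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))

rng = np.random.default_rng(1)
x = rng.standard_normal(16000)                    # clean waveform
x_adv = x + 0.001 * rng.standard_normal(16000)    # small perturbation
snr = perturbation_snr_db(x, x_adv)               # roughly 60 dB here
```

A high SNR suggests the perturbation may pass a perceptual test, but only a listening test such as XAB can confirm imperceptibility.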
Overall, the research highlights significant vulnerabilities in speech SSL paradigms, pointing to the need for further exploration of robust defense mechanisms to enhance the security of these models in real-world applications.