HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text (2402.01806v1)
Abstract: Black-box hard-label adversarial attack on text is a practical and challenging task, as the text data space is inherently discrete and non-differentiable, and only the predicted label is accessible. Research on this problem is still in an embryonic stage, and only a few methods are available. Moreover, existing methods rely on complex heuristic algorithms or unreliable gradient estimation strategies, which are prone to local optima and inevitably consume numerous queries; they therefore struggle to craft satisfactory adversarial examples with high semantic similarity and a low perturbation rate under a limited query budget. To alleviate these issues, we propose a simple yet effective framework, named HQA-Attack, for generating high-quality textual adversarial examples in the black-box hard-label attack scenario. Specifically, after initializing an adversarial example randomly, HQA-Attack first substitutes as many original words back as possible, thus shrinking the perturbation rate. It then leverages the synonym sets of the remaining changed words to further optimize the adversarial example in a direction that improves semantic similarity while preserving the adversarial condition. In addition, during the optimization procedure it searches for a transition synonym for each changed word, thus avoiding traversing the whole synonym set and reducing the number of queries. Extensive experimental results on five text classification datasets, three natural language inference datasets, and two real-world APIs show that the proposed HQA-Attack method significantly outperforms other strong baselines.
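To make the two-stage procedure in the abstract concrete, here is a minimal Python sketch. It assumes a word-substitution setting (the original and adversarial texts have the same number of tokens), and the helpers `query_label`, `synonyms`, and `semantic_sim` are hypothetical placeholders (e.g. a hard-label victim API, counter-fitted synonym sets, and a Universal Sentence Encoder similarity score). This is a simplified reading of the method, not the authors' released implementation.

```python
import random

# --- Hypothetical helpers (illustrative names, not the authors' API) ---
# query_label(text)   -> victim model's predicted label (hard label only)
# synonyms(word)      -> synonym candidates, e.g. from counter-fitted vectors
# semantic_sim(a, b)  -> sentence similarity, e.g. Universal Sentence Encoder


def substitute_back(orig_words, adv_words, is_adv):
    """Stage 1: greedily restore original words while the example stays
    adversarial, shrinking the perturbation rate."""
    adv = list(adv_words)
    changed = {i for i, (o, a) in enumerate(zip(orig_words, adv)) if o != a}
    progress = True
    while progress and changed:
        progress = False
        for i in sorted(changed):
            trial = list(adv)
            trial[i] = orig_words[i]            # try restoring one word
            if is_adv(trial):                   # still fools the model: keep
                adv = trial
                changed.discard(i)
                progress = True
    return adv, changed


def optimize_with_synonyms(orig_words, adv, changed, is_adv,
                           synonyms, semantic_sim):
    """Stage 2: for each remaining changed position, sample a 'transition'
    synonym that keeps the example adversarial, then move toward higher
    semantic similarity without scanning the whole synonym set."""
    orig_text = " ".join(orig_words)
    for i in changed:
        cands = list(synonyms(orig_words[i]))
        random.shuffle(cands)
        # transition word: first sampled synonym that stays adversarial
        pivot = next((c for c in cands
                      if is_adv(adv[:i] + [c] + adv[i + 1:])), None)
        if pivot is None:
            continue
        adv = adv[:i] + [pivot] + adv[i + 1:]
        best = semantic_sim(orig_text, " ".join(adv))
        # rank by similarity locally (no model queries); query the model
        # only for candidates that beat the pivot, keeping the budget small
        for c in sorted(cands, key=lambda w: semantic_sim(
                orig_text, " ".join(adv[:i] + [w] + adv[i + 1:])),
                reverse=True):
            trial = adv[:i] + [c] + adv[i + 1:]
            sim = semantic_sim(orig_text, " ".join(trial))
            if sim <= best:
                break                           # sorted: no gain left
            if is_adv(trial):
                adv, best = trial, sim
                break
    return adv


def hqa_attack(orig_text, init_adv_text, query_label, synonyms, semantic_sim):
    """End-to-end sketch: shrink the perturbation, then polish similarity."""
    orig_words, adv_words = orig_text.split(), init_adv_text.split()
    orig_label = query_label(orig_text)

    def is_adv(words):
        return query_label(" ".join(words)) != orig_label

    adv_words, changed = substitute_back(orig_words, adv_words, is_adv)
    adv_words = optimize_with_synonyms(orig_words, adv_words, changed,
                                       is_adv, synonyms, semantic_sim)
    return " ".join(adv_words)
```

In this reading, the transition synonym acts as a pivot: only candidates that can beat its similarity are ever sent to the model, which is what keeps the query count low compared with exhaustively testing every synonym at every position.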
Authors: Han Liu, Zhi Xu, Xiaotong Zhang, Feng Zhang, Fenglong Ma, Hongyang Chen, Hong Yu, Xianchao Zhang