Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation (2307.12520v1)

Published 24 Jul 2023 in cs.CL and cs.LG

Abstract: LLMs today achieve high accuracy across a large number of downstream tasks. However, they remain susceptible to adversarial attacks, particularly attacks in which the adversarial examples maintain considerable similarity to the original text. Given the multilingual nature of text, the effectiveness of adversarial examples across translations and how machine translation can improve the robustness of adversarial examples remain largely unexplored. In this paper, we present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation. We demonstrate that six state-of-the-art text-based adversarial attacks do not maintain their efficacy after round-trip translation. Furthermore, we introduce an intervention-based solution to this problem by integrating machine translation into the process of adversarial example generation, and demonstrate increased robustness to round-trip translation. Our results indicate that finding adversarial examples robust to translation can help identify shortcomings of LLMs that are common across languages, and motivate further research into multilingual adversarial attacks.

Authors (2)
  1. Neel Bhandari (4 papers)
  2. Pin-Yu Chen (311 papers)
Citations (2)
