
Say What I Want: Towards the Dark Side of Neural Dialogue Models (1909.06044v3)

Published 13 Sep 2019 in cs.CL, cs.AI, and cs.LG

Abstract: Neural dialogue models have been widely adopted in various chatbot applications because of their good performance in simulating and generalizing human conversations. However, there exists a dark side of these models -- due to the vulnerability of neural networks, a neural dialogue model can be manipulated by users to say what they want, which raises concerns about the security of practical chatbot services. In this work, we investigate whether we can craft inputs that lead a well-trained black-box neural dialogue model to generate targeted outputs. We formulate this as a reinforcement learning (RL) problem and train a Reverse Dialogue Generator which efficiently finds such inputs for targeted outputs. Experiments conducted on a representative neural dialogue model show that our proposed model is able to discover such desired inputs in a considerable portion of cases. Overall, our work reveals this weakness of neural dialogue models and may prompt further research into developing corresponding solutions to avoid it.
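The RL formulation in the abstract can be illustrated with a deliberately tiny sketch. It is not the paper's Reverse Dialogue Generator: the abstract does not specify the architecture, reward, or training algorithm, so everything below is an assumption made for illustration. The "dialogue model" is a fixed lookup table that is only ever queried, the reward is the fraction of output tokens that match the target, and the policy is a table of per-position logits updated with a plain REINFORCE rule and a moving-average baseline. All names (`VOCAB`, `black_box_reply`, `reward`, `search_input`) are hypothetical.

```python
import math
import random

# Hypothetical toy vocabulary; not taken from the paper.
VOCAB = ["hello", "how", "are", "you", "fine", "thanks", "bye"]

# Toy stand-in for the black-box dialogue model: a fixed token-to-token
# lookup table. The search below only calls it and never inspects the table.
_REPLY_TABLE = {
    "hello": "hello", "how": "fine", "are": "thanks", "you": "bye",
    "fine": "how", "thanks": "you", "bye": "are",
}


def black_box_reply(tokens):
    """Return the toy model's reply to a list of input tokens."""
    return [_REPLY_TABLE[t] for t in tokens]


def reward(output, target):
    """Fraction of positions where the model output matches the target."""
    return sum(o == t for o, t in zip(output, target)) / len(target)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def search_input(target, steps=3000, lr=0.2, seed=0):
    """REINFORCE search for an input whose reply resembles `target`.

    Returns the best input sampled so far and its reward.
    """
    rng = random.Random(seed)
    n = len(VOCAB)
    # One independent categorical policy per input position (an assumption;
    # a real generator would condition on the target and on earlier tokens).
    logits = [[0.0] * n for _ in target]
    baseline = 0.0
    best_input, best_reward = None, -1.0
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        picks = [rng.choices(range(n), weights=p)[0] for p in probs]
        tokens = [VOCAB[i] for i in picks]
        r = reward(black_box_reply(tokens), target)  # one black-box query
        if r > best_reward:
            best_input, best_reward = tokens, r
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)  # moving-average baseline
        for pos, chosen in enumerate(picks):
            for k in range(n):
                # Gradient of log softmax w.r.t. logit k for the chosen token.
                grad = (1.0 if k == chosen else 0.0) - probs[pos][k]
                logits[pos][k] += lr * advantage * grad
    return best_input, best_reward
```

Calling `search_input(["fine", "bye"])` returns the highest-reward input sampled during training together with its reward. The search touches the model only through `black_box_reply`, which mirrors the black-box setting the abstract describes; a realistic attack would replace the lookup table with queries to an actual dialogue model and the tabular policy with a learned sequence generator.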

Citations (16)
