Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
60 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
8 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages (2402.10753v2)

Published 16 Feb 2024 in cs.CL and cs.AI

Abstract: Tool learning is widely acknowledged as a foundational approach or deploying LLMs in real-world scenarios. While current research primarily emphasizes leveraging tools to augment LLMs, it frequently neglects emerging safety considerations tied to their application. To fill this gap, we present ToolSword, a comprehensive framework dedicated to meticulously investigating safety issues linked to LLMs in tool learning. Specifically, ToolSword delineates six safety scenarios for LLMs in tool learning, encompassing malicious queries and jailbreak attacks in the input stage, noisy misdirection and risky cues in the execution stage, and harmful feedback and error conflicts in the output stage. Experiments conducted on 11 open-source and closed-source LLMs reveal enduring safety challenges in tool learning, such as handling harmful queries, employing risky tools, and delivering detrimental feedback, which even GPT-4 is susceptible to. Moreover, we conduct further studies with the aim of fostering research on tool learning safety. The data is released in https://github.com/Junjie-Ye/ToolSword.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (9)
  1. Junjie Ye (66 papers)
  2. Sixian Li (12 papers)
  3. Guanyu Li (10 papers)
  4. Caishuang Huang (13 papers)
  5. Songyang Gao (28 papers)
  6. Yilong Wu (11 papers)
  7. Qi Zhang (784 papers)
  8. Tao Gui (127 papers)
  9. Xuanjing Huang (287 papers)
Citations (10)
X Twitter Logo Streamline Icon: https://streamlinehq.com

Tweets