
ToxicSQL: Migrating SQL Injection Threats into Text-to-SQL Models via Backdoor Attack (2503.05445v2)

Published 7 Mar 2025 in cs.CR and cs.DB

Abstract: LLMs have shown state-of-the-art results in translating natural language questions into SQL queries (Text-to-SQL), a long-standing challenge within the database community. However, security concerns remain largely unexplored, particularly the threat of backdoor attacks, which can introduce malicious behaviors into models through fine-tuning with poisoned datasets. In this work, we systematically investigate the vulnerabilities of LLM-based Text-to-SQL models and present ToxicSQL, a novel backdoor attack framework. Our approach leverages stealthy semantic and character-level triggers to make backdoors difficult to detect and remove, ensuring that malicious behaviors remain covert while maintaining high model accuracy on benign inputs. Furthermore, we propose leveraging SQL injection payloads as backdoor targets, enabling the generation of malicious yet executable SQL queries, which pose severe security and privacy risks in LLM-based SQL development. We demonstrate that injecting only 0.44% of poisoned data can result in an attack success rate of 79.41%, posing a significant risk to database security. Additionally, we propose detection and mitigation strategies to enhance model reliability. Our findings highlight the urgent need for security-aware Text-to-SQL development, emphasizing the importance of robust defenses against backdoor threats.
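
The abstract mentions detection and mitigation but does not describe them here. As a rough illustration of the defensive side only, the sketch below screens model-generated SQL for common injection-style payloads before execution. It is not the paper's method: the function name `flag_generated_sql`, the pattern set, and the example queries are all assumptions made for illustration, and the regular expressions are deliberately naive (they would also flag these tokens inside string literals).

```python
import re

# Hypothetical, non-exhaustive patterns for injection-style payloads.
# These are illustrative assumptions, not taken from the ToxicSQL paper.
SUSPICIOUS_PATTERNS = {
    # A semicolon followed by more text suggests a second, stacked statement.
    "stacked_statement": re.compile(r";\s*\S"),
    # Comment markers are often used to cut off the rest of a query.
    "comment_marker": re.compile(r"--|/\*"),
    # Always-true conditions such as "OR 1 = 1".
    "tautology": re.compile(r"\bOR\s+(\d+)\s*=\s*\1\b", re.IGNORECASE),
    # Delay functions used in time-based blind injection.
    "time_delay": re.compile(r"\b(SLEEP|BENCHMARK|PG_SLEEP)\s*\(", re.IGNORECASE),
    # Statements that modify or destroy data or schema.
    "destructive_keyword": re.compile(r"\b(DROP|DELETE|TRUNCATE|ALTER)\b", re.IGNORECASE),
}


def flag_generated_sql(sql: str) -> list[str]:
    """Return the sorted names of all suspicious patterns found in `sql`."""
    return sorted(name for name, pattern in SUSPICIOUS_PATTERNS.items()
                  if pattern.search(sql))


if __name__ == "__main__":
    for query in (
        "SELECT name FROM singer WHERE age > 30",
        "SELECT name FROM singer WHERE age > 30 OR 1 = 1",
        "SELECT name FROM singer; DROP TABLE singer",
    ):
        print(flag_generated_sql(query), "<-", query)
```

A pattern filter like this is only a first line of defense. Parsing the query with a real SQL parser and running Text-to-SQL output under a read-only, least-privilege database account would likely be more robust.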

Authors (8)
  1. Meiyu Lin
  2. Haichuan Zhang
  3. Jiale Lao
  4. Renyuan Li
  5. Yuanchun Zhou
  6. Carl Yang
  7. Yang Cao
  8. Mingjie Tang
