Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
GPT-4o
Gemini 2.5 Pro Pro
o3 Pro
GPT-4.1 Pro
DeepSeek R1 via Azure Pro
2000 character limit reached

PromptShield: Deployable Detection for Prompt Injection Attacks (2501.15145v2)

Published 25 Jan 2025 in cs.CR

Abstract: Application designers have moved to integrate LLMs into their products. However, many LLM-integrated applications are vulnerable to prompt injections. While attempts have been made to address this problem by building prompt injection detectors, many are not yet suitable for practical deployment. To support research in this area, we introduce PromptShield, a benchmark for training and evaluating deployable prompt injection detectors. Our benchmark is carefully curated and includes both conversational and application-structured data. In addition, we use insights from our curation process to fine-tune a new prompt injection detector that achieves significantly higher performance in the low false positive rate (FPR) evaluation regime compared to prior schemes. Our work suggests that careful curation of training data and larger models can contribute to strong detector performance.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.